We have puppet 0.24.8 running on multiple EIGHT core 3.16GHz servers
with 32GB of RAM, and in each case puppet is taking longer and longer
to run, as we have it control more. Currently it's taking up to 20
minutes to perform a run.

What approaches can I take to significantly reduce the time it takes
puppet to run? It's ALSO sucking up an inordinate amount of CPU while
it performs a run. The server is using passenger.

Doug
On Wed, Mar 10, 2010 at 9:58 AM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> We have puppet 0.24.8 running on multiple EIGHT core 3.16GHz servers
> with 32GB of RAM, and in each case puppet is taking longer and longer
> to run, as we have it control more. Currently it's taking up to 20
> minutes to perform a run.
>
> What approaches can I take to significantly reduce the time it takes
> puppet to run? It's ALSO sucking up an inordinate amount of CPU while
> it performs a run. The server is using passenger.

What Ruby version are you running?
Do you have storeconfigs on?
How have you configured passenger?

Upgrading to 0.25.4 on your server and clients will improve file
transfers and significantly reduce memory consumption, but CPU usage
will still be high in my experience.

--
nigel
On Wed, Mar 10, 2010 at 12:58 PM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> What approaches can I take to significantly reduce the time it takes
> puppet to run? It's ALSO sucking up an inordinate amount of CPU while
> it performs a run. The server is using passenger.

Some more information about what you are managing would be helpful
here. One open bug, for instance, is about updating Puppet to batch
yum transactions, which can speed up run time. If you are seeing very
long runs on the client, that can be a factor.

On the server side, there are various things you might want to do,
such as selectively updating certain servers at a time (i.e. using
puppetrun), or setting up different schedules for different machines
so they don't all hit at once (such as using cron).

We're also working to reduce server load via cached catalogs and so
forth, though it ultimately does depend a bit on how much you are
managing per server.

--Michael
On 10/03/10 18:58, Douglas Garstang wrote:
> What approaches can I take to significantly reduce the time it takes
> puppet to run? It's ALSO sucking up an inordinate amount of CPU while
> it performs a run. The server is using passenger.

Where do you experience the issue: on the clients or on the master?
0.25 greatly improved master performance and file serving.

High CPU usage on the client is highly dependent on what you are
managing (i.e. most of the time is usually spent in processes other
than puppet, like the package manager). Something that can also stress
clients is managing deep file hierarchies.

For high CPU usage on the master, you can try to:
 * disable storeconfigs or use thin_storeconfigs (0.25)
 * make sure your clients sleep longer than the default, or use splay
   times so they don't all request their catalogs at the same time
 * use a different ruby interpreter, and/or passenger
 * if you're doing tons of file serving, offload it to a static
   server (see my last blog article in my signature). This will free
   your masters to serve more catalogs per unit of time.

Hope that helps,
--
Brice Figureau
My Blog: http://www.masterzen.fr/
On Wed, Mar 10, 2010 at 10:06 AM, Nigel Kersten <nigelk@google.com> wrote:
> What Ruby version are you running?
> Do you have storeconfigs on?
> How have you configured passenger?

Ruby version, on client and server, is:
ruby 1.8.5 (2006-08-25) [x86_64-linux]

We aren't using storeconfigs... I think the idea of putting puppet
config in a db is stupid, because you lose your ability to revision
control your changes.

I configured passenger as per:
http://reductivelabs.com/trac/puppet/wiki/UsingPassenger

> Upgrading to 0.25.4 on your server and clients will improve file
> transfers and significantly reduce memory consumption, but CPU usage
> will still be high in my experience.

Until I know for sure that 0.25.4 will fix the performance problems,
given that I've had all sorts of problems with 0.25.x in the past (as
it relates to SSL keys), I really don't want to do that. I can't take
that risk.
On Wed, Mar 10, 2010 at 10:18 AM, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> Where do you experience the issue: on the clients or on the master?
> 0.25 greatly improved master performance and file serving.

The issue is on the clients. The master seems fine. I'd like to avoid
0.25 for now, as I simply could not get the SSL keys to work with it
the last time I tried, and I can't risk production systems not being
able to receive updates for days on end.

> High CPU usage on the client is highly dependent on what you are
> managing (i.e. most of the time is usually spent in processes other
> than puppet, like the package manager). Something that can also
> stress clients is managing deep file hierarchies.

We probably have some deep file hierarchies.

> For high CPU usage on the master, you can try to:
> [...]

The main issue isn't even really the high CPU usage... it's just that
the client takes 20 minutes to run. That's the really inconvenient
bit. We aren't using storeconfigs. Putting config in a db is crazy. We
only have a total of maybe a dozen machines running the client, so I
doubt increasing the time between runs will make any difference. We
ARE using passenger on the server (said that in my original post). Not
doing tons of file serving... the master is not working anywhere near
as hard as the clients.
On Wed, Mar 10, 2010 at 10:19 AM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> Ruby version, on client and server, is:
> ruby 1.8.5 (2006-08-25) [x86_64-linux]

You should see significant improvements if you move to a more recent
Ruby stack.

A simple test is http://www.rubyenterpriseedition.com as you can
install it to /opt and not interfere with your current stack or have
to work on packaging while you evaluate it.

I have it all packaged for Debian now, but I used to simply symlink
puppet/facter etc. from the normal ruby lib into the Ruby EE one.

> We aren't using storeconfigs... I think the idea of putting puppet
> config in a db is stupid, because you lose your ability to revision
> control your changes.

That's not all it can do, but that's somewhat irrelevant.

> I configured passenger as per:
> http://reductivelabs.com/trac/puppet/wiki/UsingPassenger

I have this config for 4 VCPUs and 4GB RAM:

<IfModule mod_passenger.c>
    PassengerMaxRequests 5500
    PassengerPoolIdleTime 600
    PassengerMaxPoolSize 10
    PassengerStatThrottleRate 600
</IfModule>

MaxRequests isn't so necessary with 0.25, but it definitely stops
memory leaks.

What do your machines look like when they're busy? Are all cores maxed
out? Uptime/load stats? Memory consumption?

> Until I know for sure that 0.25.4 will fix the performance problems,
> given that I've had all sorts of problems with 0.25.x in the past (as
> it relates to SSL keys), I really don't want to do that. I can't take
> that risk.

No-one knows for sure whether 0.25.4 will fix your specific issues.
You don't have a development environment you can test on?

--
nigel
On Wed, Mar 10, 2010 at 10:24 AM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> The issue is on the clients. The master seems fine. I'd like to avoid
> 0.25 for now, as I simply could not get the SSL keys to work with it
> the last time I tried, and I can't risk production systems not being
> able to receive updates for days on end.

If the issue is on the clients, then ignore everything I said above :)

Are you sure the server isn't a bottleneck though? Does it look
overloaded?

--
nigel
> We aren't using storeconfigs... I think the idea of putting puppet
> config in a db is stupid, because you lose your ability to revision
> control your changes.

Just on a technical note, this is not really what storeconfigs is
about. It also gives you information such as the current value of
facts on each node (giving you, for example, a simple inventory
system). The puppet manifest still drives the system, and can still be
version controlled.

--Michael
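[For illustration, a minimal sketch of one of the other things
storeconfigs enables beyond fact inventory -- exported resources. This
example is not from the thread; the nagios_host usage is hypothetical,
and the manifests themselves still live in version control:]

# On every client node: export a host entry describing this node.
@@nagios_host { $fqdn:
    address => $ipaddress,
}

# On the monitoring server: collect every exported nagios_host.
Nagios_host <<| |>>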
On Wed, Mar 10, 2010 at 10:25 AM, Nigel Kersten <nigelk@google.com> wrote:
> You should see significant improvements if you move to a more recent
> Ruby stack.
> [...]
> What do your machines look like when they're busy? Are all cores maxed
> out? Uptime/load stats? Memory consumption?

Thanks Nigel. Let me go check this stuff.
On 10/03/10 19:24, Douglas Garstang wrote:
> The issue is on the clients. The master seems fine.
>
> We probably have some deep file hierarchies.

Try commenting those out in your manifests, just to see if that is the
root cause.

Also make sure you don't have a default like this:
  File { checksum => md5 }
or another checksum set on your non-sourced/non-content file{}
resources. That means all of your locally managed, non-sourced files
will have to be md5'd on every run. If you combine this with deep
hierarchies and recursion, then you'll have some CPU consumption
trouble.

If you have those recursive, non-sourced file{} resources, make sure
to set:

  checksum => undef

in them, to at least avoid md5ing everything. I think this problem
doesn't exist in 0.25.

> The main issue isn't even really the high CPU usage... it's just that
> the client takes 20 minutes to run. [...] Not doing tons of file
> serving... the master is not working anywhere near as hard as the
> clients.

You can ignore my advice above about the master, since only your
clients are affected.

--
Brice Figureau
My Blog: http://www.masterzen.fr/
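[A minimal sketch of the checksum advice above. The path, ownership
and recursion settings here are hypothetical, not taken from the
thread:]

# A site-wide default like this forces puppet to md5 every managed
# file, whether or not it is sourced from the master:
# File { checksum => md5 }

# For a recursive file resource that isn't sourced, disabling the
# checksum avoids hashing the whole tree on every run:
file { "/opt/jboss/current/server/tfel1":
    ensure   => directory,
    recurse  => true,
    owner    => "jboss",
    group    => "jboss",
    checksum => undef,
}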
On Wed, Mar 10, 2010 at 11:41 AM, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> Also make sure you don't have a default like this:
>   File { checksum => md5 }
> or another checksum set on your non-sourced/non-content file{}
> resources.
> [...]

Thanks. Checked and files are NOT being checksummed.

Doug.
So, it became apparent to me, after emailing someone off list, that
managing a lot of files in deep directory structures might be part of
the cause.

We are running 10 instances of JBoss and 10 instances of Tomcat on
each of these servers. Don't ask me why; it's just the way it was done
before I arrived, and changing it is not trivial.

On disk, each instance of JBoss starts at
/opt/jboss/current/server/tfelN (where N is the instance number)

and each instance of Tomcat starts at
/opt/tomcat/tfelN/starterkit/current (where N is the instance number).

I manually looked through the puppet config and counted 25 unique
files that are being managed for JBoss and Tomcat within these paths.
If you do the math, 25 x 10 x 2 = 500. That's therefore (currently)
500 unique files that are being managed in these deep directory
structures. Could that potentially be the reason behind puppet's crap
performance?

Doug.
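[For illustration only: a sketch of how a per-instance define can
multiply 25 managed files per instance out to ~500 file resources
across 10 JBoss and 10 Tomcat instances. The define name matches the
Jboss::Instance[...] resources visible in the debug output later in
the thread, but the source URL, ownership and file list are
hypothetical, not the poster's actual manifests:]

define jboss::instance() {
    # $name is the instance identifier, e.g. "tfel1" .. "tfel10".
    file { "/opt/jboss/current/server/${name}/conf/jboss.web/localhost/rewrite.properties":
        source => "puppet:///modules/jboss/rewrite.properties",
        owner  => "jboss",
        group  => "jboss",
    }
    # ...plus roughly 24 more file resources per instance.
}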
On 10/03/10 22:06, Douglas Garstang wrote:
> On disk, each instance of JBoss starts at
> /opt/jboss/current/server/tfelN (where N is the instance number)
>
> and each instance of Tomcat starts at
> /opt/tomcat/tfelN/starterkit/current (where N is the instance number).

Do you source the whole hierarchy?
Or do you only manage it?

> I manually looked through the puppet config and counted 25 unique
> files that are being managed for JBoss and Tomcat within these paths.
> If you do the math, 25 x 10 x 2 = 500. That's therefore (currently)
> 500 unique files that are being managed in these deep directory
> structures. Could that potentially be the reason behind puppet's crap
> performance?

What do you manage for those files?
But no, 500 doesn't seem like a high number to me.

You mentioned in another e-mail in this thread that the problem is
more the 20 minute run than the CPU.
Could it be possible you have many "slow" execs?
Or do you manage many packages?

This also reminds me of Ohad's bug:
http://projects.reductivelabs.com/issues/1719

At this stage you should probably run puppetd on the console with
--debug to see what happens and whether it stalls (and run with
--summarize too).

--
Brice Figureau
My Blog: http://www.masterzen.fr/
On Wed, Mar 10, 2010 at 1:17 PM, Brice Figureau
<brice-puppet@daysofwonder.com> wrote:
> At this stage you should probably run puppetd on the console with
> --debug to see what happens and whether it stalls (and run with
> --summarize too).

I just ran puppet in debug mode and it was obvious that most of the
puppet run time was spent in checksumming files. E.g.:

debug: //Node[app01.fr.xxx.com]/Jboss::Instance[tfel8]/File[/opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties]:
Creating checksum {md5}f5d16bcc20b92631eb59514018fd34e5

... takes a long time to run. Multiply that by several hundred files...

However, when I run this on the command line:

md5sum /opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties

... the result is instantaneous... So... is puppet using a ruby
library for performing md5 checksums? Is that where the performance
bottleneck could be?

Doug
On Wed, Mar 10, 2010 at 1:34 PM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> ... the result is instantaneous... So... is puppet using a ruby
> library for performing md5 checksums? Is that where the performance
> bottleneck could be?

Also...

I just grabbed an example online of performing an md5 checksum on a
file in ruby.
Ran it on the same file above.
The result was instantaneous... so the question remains... what is
puppet doing???

Doug
On Wed, Mar 10, 2010 at 1:38 PM, Douglas Garstang
<doug.garstang@gmail.com> wrote:
> I just grabbed an example online of performing an md5 checksum on a
> file in ruby.
> Ran it on the same file above.
> The result was instantaneous... so the question remains... what is
> puppet doing???

Jeez..... it went quiet in this thread didn't it...
On Wed, 2010-03-10 at 17:18 -0800, Douglas Garstang wrote:
> I just ran puppet in debug mode and it was obvious that most of the
> puppet run time was spent in checksumming files. E.g.:
>
> debug: //Node[app01.fr.xxx.com]/Jboss::Instance[tfel8]/File[/opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties]:
> Creating checksum {md5}f5d16bcc20b92631eb59514018fd34e5
>
> ... takes a long time to run. Multiply that by several hundred files...

That's particularly strange. Note that when puppet displays this, the
checksum has already been computed. So if you see a "downtime" after
it prints one checksum and before the next, it is quite possible
Puppet is sourcing the file from the master, in which case the
bottleneck would be the fileserving and not the checksumming.

Note again: if the file is not sourced and not a template, you don't
need to checksum it.

You need to give us more information about those files (i.e. size,
sourced/template?...).

Can you run with --summarize? Puppet will report which operations it
is spending its time in.

> However, when I run this on the command line:
> md5sum /opt/jboss/current/server/tfel8/conf/jboss.web/localhost/rewrite.properties
>
> ... the result is instantaneous... So... is puppet using a ruby
> library for performing md5 checksums? Is that where the performance
> bottleneck could be?

I don't know.

> Jeez..... it went quiet in this thread didn't it...

I'm not in your timezone, if that's what you were asking.

--
Brice Figureau
Follow the latest Puppet Community evolutions on www.planetpuppet.org!
On Mar 10, 2010, at 10:24 AM, Douglas Garstang wrote:
> We aren't using storeconfigs. Putting config in a db is crazy.

I think there's some confusion here on what storeconfigs does for you
- you're just caching the compiled configurations in a database,
nothing else really changes.

--
A government that robs Peter to pay Paul can always depend on the
support of Paul. -- George Bernard Shaw
---------------------------------------------------------------------
Luke Kanies -|- http://reductivelabs.com -|- +1(615)594-8199
On Mar 10, 2010, at 1:38 PM, Douglas Garstang wrote:
> I just grabbed an example online of performing an md5 checksum on a
> file in ruby.
> Ran it on the same file above.
> The result was instantaneous... so the question remains... what is
> puppet doing???

The short answer is: more than md5sum is. All you're seeing is the log
message; you don't really know if that's what's taking all of the
time.

We've always known about the performance problems of using Puppet to
manage large file hierarchies, which is why we generally recommend you
don't do it unless you've tested that it works for your use cases.

I've basically only ever seen two non-pathological cases where client
runs take a long time: either you're using a lot of yum (which we've
mostly resolved), or you're managing a lot of large file hierarchies
(or a few very large ones).

You can look at the reports coming out of your systems to see where
time is being spent, and that should tell you almost immediately. It
already looks like it's files, so I'd start by trying to trim back
recursion where you can, and try not to manage large files where you
can avoid it.

--
Silence is a text easy to misread.
    -- A. A. Attanasio, 'The Eagle and the Sword'
---------------------------------------------------------------------
Luke Kanies -|- http://reductivelabs.com -|- +1(615)594-8199
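[A rough sketch of what "trim back recursion" can look like in a
manifest. The paths follow the layout described earlier in the thread,
but the source URL and attributes are hypothetical, not the poster's
actual config:]

# Instead of recursing over a whole instance tree...
# file { "/opt/jboss/current/server/tfel1":
#     recurse => true,
#     source  => "puppet:///modules/jboss/tfel1",
# }

# ...manage only the specific files that actually need to change:
file { "/opt/jboss/current/server/tfel1/conf/jboss.web/localhost/rewrite.properties":
    source => "puppet:///modules/jboss/rewrite.properties",
    owner  => "jboss",
    group  => "jboss",
    mode   => "0644",
}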