Peter Meier
2008-May-18 11:29 UTC
[Puppet Users] connection timeout / memory usage / locks / recompiling
Hi

I have created a module for djbdns [1] which generates the data file (DNS entries) from defines in Puppet classes. This is all fine and working. :)

However, after a few tests I autogenerated the classes for our domains from the current source of our domain data. That gives about 50 classes, each calling around 3 defines, which in turn make concatenated_file calls (from David's common module). In the end this comes to around 350 files (managed by the file resource using the content parameter), often containing only a single line.

With all these classes in place I ran puppetd --test once on the client, which should then pull all of these classes down. The first run ended in a timeout. I remembered that there is a fix in the upcoming release which respects the timeouts set in the config, so I applied the patch from ticket #1176 [2] on the client and reran puppetd --test. Now it ran quite a bit longer :) BUT it is still failing:

    time puppetd --test
    info: Loading fact acpi_available
    info: Loading fact interfaces
    info: Loading fact netmask
    info: Loading fact virtual
    info: Loading fact configured_ntp_servers
    info: Loading fact selinux
    info: Retrieving plugins
    notice: Ignoring cache
    err: Could not call puppetmaster.getconfig: #<RuntimeError: HTTP-Error: 504 Gateway Time-out>
    err: Could not retrieve catalog: HTTP-Error: 504 Gateway Time-out
    warning: Not using cache on failed catalog

    real 4m23.383s
    user 1m10.460s
    sys 3m8.400s

I'm using the nginx setup described on the wiki [3]. I then tried adjusting ssl_session_timeout and ssl_session_cache, giving each of them 10m more. However, this didn't change anything: the run was aborted after the same amount of time. I then noticed on the master that it was using a hell of a lot more memory and was close to thrashing.
So I looked more closely at the master and saw that one of the 4 puppetmaster daemons (the one the problem host connected to) had fast-growing memory usage while the problem host was connected; at the end it was at a bit more than 30% memory usage (the Xen instance has about 1.4G RAM, 3G swap, and a 2.33GHz CPU, which isn't really used by anything else). After the client disconnected with the above error, I could see that the process was still running high on CPU and still consuming more memory, but it then grew by only about 3%. I could also see strange behaviour in the logs:

    Sun May 18 10:05:30 +0200 2008 Puppet (notice): Compiled catalog for host1 in 18.92 seconds
    Sun May 18 10:08:52 +0200 2008 Puppet (notice): Compiled catalog for puppetmaster in 5.92 seconds
    Sun May 18 10:10:01 +0200 2008 Puppet (notice): Compiled catalog for host2 in 4.06 seconds
    Sun May 18 10:14:59 +0200 2008 Puppet (err): Could not store configs: Mysql::Error: Lock wait timeout exceeded; try restarting transaction: DELETE FROM `param_values` WHERE `id` = 4880339
    Sun May 18 10:14:59 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 157.61 seconds
    Sun May 18 10:17:35 +0200 2008 Puppet (err): Could not store configs: Mysql::Error: Lock wait timeout exceeded; try restarting transaction: DELETE FROM `param_values` WHERE `id` = 4880339
    Sun May 18 10:17:37 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 255.58 seconds
    Sun May 18 10:27:42 +0200 2008 Puppet (err): Could not store configs: Mysql::Error: Lock wait timeout exceeded; try restarting transaction: DELETE FROM `param_values` WHERE `id` = 4880339
    Sun May 18 10:27:42 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 800.52 seconds
    Sun May 18 10:41:32 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 1809.78 seconds
    Sun May 18 10:44:20 +0200 2008 Puppet (notice): Compiled catalog for host2 in 108.01 seconds
    Sun May 18 10:47:06 +0200 2008 Puppet (err): no classes for this kernel yet defined! at /srv/puppet/development/modules/ntp/manifests/init.pp:21 on node host2
    Sun May 18 10:47:06 +0200 2008 Puppet (err): no classes for this kernel yet defined! at /srv/puppet/development/modules/ntp/manifests/init.pp:21 on node host2
    Sun May 18 10:47:17 +0200 2008 Puppet (err): no classes for this kernel yet defined! at /srv/puppet/development/modules/ntp/manifests/init.pp:21 on node host2
    Sun May 18 10:47:17 +0200 2008 Puppet (err): no classes for this kernel yet defined! at /srv/puppet/development/modules/ntp/manifests/init.pp:21 on node host2
    Sun May 18 10:48:05 +0200 2008 Puppet (notice): Compiled catalog for host2 in 201.16 seconds
    Sun May 18 10:48:10 +0200 2008 Puppet (notice): Compiled catalog for puppetmaster in 206.64 seconds
    Sun May 18 10:54:44 +0200 2008 Puppet (notice): Compiled catalog for puppetmaster in 722.00 seconds
    Sun May 18 10:55:06 +0200 2008 Puppet (notice): Compiled catalog for puppetmaster in 691.28 seconds
    Sun May 18 11:17:32 +0200 2008 Puppet (notice): Caught TERM; shutting down

So if I start puppetd --test only _once_ (after restarting the master beforehand), I see the following behaviour:

1. The puppetmaster's memory usage grows very quickly, up to a level I haven't yet determined.
2. The client (problemhost in the logs) exits with a gateway timeout, which can't be adjusted by reconfiguring nginx. (Or did I miss anything? Maybe a firewall issue?)
3. The puppetmaster still seems to be compiling the client's manifests.
4. The puppetmaster runs into locking issues with the MySQL db.
5. The puppetmaster is _recompiling_ the client's manifests several times without the client reconnecting.
6. Compiling for other hosts afterwards takes about 10 times as long as before, and they start complaining about things which are simply not true: the ntp error is a fail I defined myself for the case where the kernel reported by facter is neither Linux nor OpenBSD (I haven't configured ntp on any other system yet), and this worked before, as you can see in the logs! I assume this is related to some corrupted memory?
7. Compiling the client's manifests still takes ages until I restart the master.

So in my opinion there still seem to be some issues with memory or whatever. I don't see any misconfiguration on my side, and in my opinion generating that many files shouldn't really be a problem. Or is it, and this is the same issue as with serving huge files or lots of files (I thought that serving files and using the content param might be different), so people should wait until the REST support is here before ideas like this will work? ;) Or are there any other ideas about what could be wrong and how I could fix this issue?

Ah, and I use 0.24.4 on all clients as well as on the master. The master has the latest nagios_commands improvements applied (for some other host) and the problem client has the timeout patch applied.

thanks a lot and greets pete.

[1] http://github.com/duritong/puppet-djbdns/tree/master
[2] http://reductivelabs.com/trac/puppet/ticket/1176
[3] http://reductivelabs.com/trac/puppet/wiki/UsingMongrelNginx

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com
To unsubscribe from this group, send email to puppet-users-unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en
-~----------~----~----~----~------~----~------~--~---
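To make the repeated compiles in the log above easier to compare, here is a minimal Python sketch that groups the "Compiled catalog" notices by host. It assumes the log keeps the exact "Compiled catalog for <host> in <n> seconds" wording seen in the excerpt; the names compile_times and SAMPLE are purely illustrative and not part of Puppet.

```python
import re
from collections import defaultdict

# Pattern for the "Compiled catalog" notices as they appear in the excerpt.
# Assumption: the wording of this message is stable across the whole log.
COMPILED = re.compile(
    r"Compiled catalog for (?P<host>\S+) in (?P<seconds>\d+(?:\.\d+)?) seconds"
)


def compile_times(lines):
    """Return {host: [seconds, ...]}, keeping the order of the log."""
    times = defaultdict(list)
    for line in lines:
        match = COMPILED.search(line)
        if match:
            times[match.group("host")].append(float(match.group("seconds")))
    return dict(times)


# A few lines copied from the log excerpt (hypothetical variable name).
SAMPLE = [
    "Sun May 18 10:05:30 +0200 2008 Puppet (notice): Compiled catalog for host1 in 18.92 seconds",
    "Sun May 18 10:10:01 +0200 2008 Puppet (notice): Compiled catalog for host2 in 4.06 seconds",
    "Sun May 18 10:14:59 +0200 2008 Puppet (err): Could not store configs: Mysql::Error: "
    "Lock wait timeout exceeded; try restarting transaction: "
    "DELETE FROM `param_values` WHERE `id` = 4880339",
    "Sun May 18 10:14:59 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 157.61 seconds",
    "Sun May 18 10:17:37 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 255.58 seconds",
    "Sun May 18 10:27:42 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 800.52 seconds",
    "Sun May 18 10:41:32 +0200 2008 Puppet (notice): Compiled catalog for problemhost in 1809.78 seconds",
    "Sun May 18 10:44:20 +0200 2008 Puppet (notice): Compiled catalog for host2 in 108.01 seconds",
]

if __name__ == "__main__":
    for host, seconds in compile_times(SAMPLE).items():
        print(f"{host}: {len(seconds)} compile(s), slowest {max(seconds):.2f}s")
```

Run against the full master log, a summary like this would show at a glance which hosts were compiled more than once and how much the compile time degraded after the problem run.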
Luke Kanies
2008-May-20 03:25 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
On May 18, 2008, at 6:29 AM, Peter Meier wrote:

> So in my opinion there seems to be still some issues with memory or
> whatever. Actually I don't see any misconfiguration on my site, nor are
> the generation of that much files in my opinion really a problem? Or are
> they and this is the same as with serving huge and a lot of files (I
> thought that maybe serving and using the content param are different)
> and people should wait until the REST support is here and then ideas
> like these will work? ;)

I can't disagree with your conclusion -- there's a problem there.

The next step is to figure out what's causing the problem and fix it.

--
Most people are born and years later die without really having lived at all. They play it safe and tiptoe through life with no aspiration other than to arrive at death safely.
    -- Tony Campolo, "Carpe Diem"
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Peter Meier
2008-May-22 10:26 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
Hi

> I can't disagree with your conclusion -- there's a problem there.
>
> The next step is to figure out what's causing the problem and fix it.

Could somebody point me to what these steps would be? I'd like to do it and nail down the problem. I would have time now to investigate it.

However, another idea is that the Ruby on CentOS 5 is quite old and might be related to the problems. At work we had some similarly weird problems with one of our own Ruby scripts, which disappeared after upgrading Ruby to 1.9, so this might also be worth a try?

greets Pete
Luke Kanies
2008-May-22 14:51 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
On May 22, 2008, at 5:26 AM, Peter Meier wrote:

> Hi
>
>> I can't disagree with your conclusion -- there's a problem there.
>>
>> The next step is to figure out what's causing the problem and fix it.
>
> could somebody point me what these steps would be? I'd like to do it and
> nail down the problem. I would have time now to investigate on that
> problem.
>
> However another idea is that the ruby on centos5 is quite old and might
> be related to the problems. And remembering that @ work we had some
> weird similar problems with an own ruby script, which disappeared after
> upgrading ruby to 1.9 this might be also worth a try?

An old ruby could cause problems, definitely, but... I haven't heard of anyone running Puppet with 1.9 (is it even officially out yet?).

Catch me next week, maybe on IRC, and I'll be in a better position to help you track it down, hopefully.

--
All power corrupts, but we need the electricity.
    -- Unknown
---------------------------------------------------------------------
Luke Kanies | http://reductivelabs.com | http://madstop.com
Mark Foster
2008-May-22 15:10 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
Luke Kanies wrote:

> An old ruby could cause problems, definitely, but... I haven't heard
> of anyone running Puppet with 1.9 (is it even officially out yet?).

Sorta related... Is anyone running puppet on ruby v1.6 or will this work? That is what comes stock with Mac OS X 10.3.
Adam Jacob
2008-May-22 16:13 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
On Thu, May 22, 2008 at 8:10 AM, Mark Foster <mfoster@bitpusher.com> wrote:

> Sorta related... Is anyone running puppet on ruby v1.6 or will this work?
> That is what comes stock with Mac OS X 10.3.

Ruby 1.6 is fine.

Adam

--
HJK Solutions - We Launch Startups - http://www.hjksolutions.com
Adam Jacob, Senior Partner
T: (206) 508-4759
E: adam@hjksolutions.com
Peter Meier
2008-May-23 13:18 UTC
[Puppet Users] Re: connection timeout / memory usage / locks / recompiling
Hi

> An old ruby could cause problems, definitely, but... I haven't heard
> of anyone running Puppet with 1.9 (is it even officially out yet?).

nope, we took the vanilla one.

> Catch me next week, maybe on IRC, and I'll be in a better position to
> help you track it down, hopefully.

ok, I'll do so around Thursday.

thanks and greets pete