Hello all,

I'm playing with a new puppet setup (2.6.1 on debian stable with ruby 1.8) and I've run into a problem with puppetd consuming ridiculous amounts of memory.

My setup isn't that complex (apache2 configuration and a few other packages/services), but puppetd will force the machine into swapping. It seems to be CPU bound; if I run strace -e '!rt_sigprocmask' on it, I can see that the only system calls it makes are brk() calls. The system ends up swapping heavily and of course puppetd never makes progress. (On this VM with 512MB of RAM, puppetd ends up w/ an RSS that oscillates around 450MB, while having 596MB mapped).

It seems like there's a superlinear algorithm or some infinite loop that's triggered by my configuration. FWIW, the catalog for this client is 3145 lines long and the server compiles it in 0.21 seconds.

Now, this client was able to run the catalog just fine a few days ago, so I'm going to concentrate on my last set of changes, but I'm reporting this here as this seems like an obvious bug to me. Perhaps others have run into it before?

Also, is there some mechanism to monitor what puppetd is doing that will provide me with more information than --debug?

Thanks,
Aggelos

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
You might want to try running the client with --evaltrace for additional information.

Trevor

On 10/5/10, Angelos Oikonomopoulos <angelos.oikonomopoulos@fp-commerce.de> wrote:
> [...]
> Also, is there some mechanism to monitor what puppetd is doing that will
> provide me with more information than --debug?

--
Trevor Vaughan
Vice President, Onyx Point, Inc
(410) 541-6699
tvaughan@onyxpoint.com

-- This account not approved for unencrypted proprietary information --
This is huge. Are you serving a lot of big files or templates?

On Oct 5, 2010, at 4:01 AM, Angelos Oikonomopoulos wrote:
> [...]
> (On this VM with 512MB of RAM, puppetd ends up w/ an RSS that oscillates around 450MB, while having 596MB mapped).
Angelos Oikonomopoulos
2010-Oct-06 09:00 UTC
Re: [Puppet Users] puppetd memory consumption
On 10/05/2010 06:58 PM, Trevor Vaughan wrote:
> You might want to try running the client with --evaltrace for
> additional information.

Thanks. This didn't help much, as it only prints the information after each resource has been evaluated, so I still have no idea what it's doing when it gets stuck. However, I've instrumented the catalog application with unroller (suggestions for more appropriate tools welcome) and I'm currently waiting for it to get to the interesting parts (unroller slows things down a lot).

That said, I've also tried to figure out which part of my configuration causes the issue. It turns out that if I comment out the include of this class: http://paste.lisp.org/display/115229 in my node definition the problem goes away. The class definition is an exact copy of my actual definition, except for the s/<string>/foo/g. There is an interesting coincidence here; my node definition is basically

    node 'fqdn' inherits managed_node {
        $openntpd_server = 'ntp.fpc.local'
        include openntpd
        include foohier
    }

where

    node managed_node {
        include util_packages
        ssh_authorized_key { "user@host":
            user   => 'root',
            ensure => present,
            type   => 'ssh-rsa',
            key    => 'AAAA...==',
        }
        user { "www-sync":
            ensure    => present,
            allowdupe => false,
            uid       => <numeric id>,
        }
    }

and

    class util_packages {
        $packagelist = [8 packages with minimal dependencies]
        package { $packagelist:
            ensure => installed
        }
    }

Notice that, by mistake, I'm not requiring User["www-sync"] in foohier. The last line puppetd prints before it gets stuck is always:

    info: /Stage[main]//Node[managed_node]/User[www-sync]: Evaluated in 0.00 seconds

I have no idea if this could be relevant. I very much expect this to be a particularly creative error on my part, but of course this failure mode is less than desirable :-)

TIA for any hints,
Aggelos
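The missing relationship mentioned above could be declared along the following lines. This is only a sketch: the real foohier class lives in the linked paste and is not reproduced here, and the path /srv/foo is a placeholder chosen for illustration.

```puppet
# Hypothetical sketch: the real foohier class is in the linked paste;
# the path below is a placeholder.
class foohier {
  file { '/srv/foo':
    ensure  => directory,
    owner   => 'www-sync',
    # Make sure the user exists before ownership is applied.
    require => User['www-sync'],
  }
}
```

The reference User['www-sync'] resolves against the whole catalog, so it can point at the user resource declared in the managed_node node definition.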
Angelos Oikonomopoulos
2010-Oct-06 09:01 UTC
Re: [Puppet Users] puppetd memory consumption
On 10/06/2010 12:17 AM, Patrick wrote:
> This is huge. Are you serving a lot of big files or templates?

Not at all.

Aggelos
I've had that same issue with things hanging and having no idea what they're hanging on.

Try popping open your YAML catalog in a text editor and see what comes right after the entry that's listed last. I've had reasonable luck with figuring things out from that.

On 10/6/10, Angelos Oikonomopoulos <angelos.oikonomopoulos@fp-commerce.de> wrote:
> On 10/05/2010 06:58 PM, Trevor Vaughan wrote:
>
> Thanks. This didn't help much, as it's only printing out the information
> after the resource evaluation, so I still have no idea what it's doing
> when it gets stuck. However I've instrumented the catalog application
> with unroller (suggestions for more appropriate tools welcome) and I'm
> currently waiting for it to get to the interesting parts (unroller slows
> things down much).
>
> That said, I've also tried to figure out which part of my configuration
> causes the issue.
<snip/>

--
Trevor Vaughan
Angelos Oikonomopoulos
2010-Oct-06 13:14 UTC
Re: [Puppet Users] puppetd memory consumption
On 10/06/2010 02:41 PM, Trevor Vaughan wrote:
> I've had that same issue with things hanging and having no idea what
> they're hanging on.
>
> Try popping open your YAML catalog in a text editor and see what comes
> right after the entry that's listed last.

Hmm, this sounds like good advice, but is the order of the resources in the resource table the same as after the topological sort of the relationship graph (I have Puppet::Transaction.evaluate in mind here)?

In any case, by /more/ selective commenting I've deduced that it's the recurse => true in http://paste.lisp.org/display/115229 that triggers this. If I leave it out (or change it to false), puppet applies the catalog as expected.

Can someone figure out what the actual issue is?

Thanks,
Aggelos
> In any case, by /more/ selective commenting I've deduced that it's the
> recurse => true in http://paste.lisp.org/display/115229 that triggers
> this. If I leave it out (or change it to false), puppet applies the
> catalog as expected.
>
> Can someone figure out what the actual issue is?

About how many files are we talking about under the hierarchy of the recursive file resources? And how big are they?

~pete
On Wed, Oct 06, 2010 at 03:18:44PM +0200, Peter Meier wrote:
> about how many files are we talking under the hierarchy of the recurse
> file resources? and how big are they?

And which filesystem type. Some are much more painful to walk than others.

--
Bruce

What would Edward Woodward do?
> In any case, by /more/ selective commenting I've deduced that it's the
> recurse => true in http://paste.lisp.org/display/115229 that triggers
> this. If I leave it out (or change it to false), puppet applies the
> catalog as expected.

Haven't I hinted at you to refactor that in IRC? ;-)

Seriously though, make sure this is not related to checksums. I've had a puppetd run for hours when it tried to compute a completely unnecessary checksum for a large catalina.out file in a recurse => true directory.

Numerous workarounds float around in the issue tracker, but I have found none to work in 0.25.5. I just evade recurse these days and use exec { "chmod" ... } instead.

Regards,
Felix
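The exec-based workaround described above might look roughly like the sketch below. The resource title, the path /srv/foo and the mode string are placeholders chosen for illustration; none of them come from the manifests discussed in this thread.

```puppet
# Hypothetical sketch: title, path and mode are placeholders.
exec { 'chmod-foo-tree':
  # Capital X adds the execute bit only to directories (and to files
  # that are already executable), so directories stay traversable.
  command => '/bin/chmod -R u=rwX,go=rX /srv/foo',
}
```

As written, the command has no guard and therefore runs on every catalog application. The exec type's onlyif, unless and refreshonly attributes can restrict that, at the cost of a more involved manifest.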
Angelos Oikonomopoulos
2010-Oct-06 13:32 UTC
Re: [Puppet Users] puppetd memory consumption
On 10/06/2010 03:18 PM, Peter Meier wrote:
>> In any case, by /more/ selective commenting I've deduced that it's the
>> recurse => true in http://paste.lisp.org/display/115229 that triggers
>> this. If I leave it out (or change it to false), puppet applies the
>> catalog as expected.
>>
>> Can someone figure out what the actual issue is?
>
> about how many files are we talking under the hierarchy of the recurse
> file resources? and how big are they?

18k+ files in 5k+ directories, adding up to 250MB of disk usage.

However, as I've mentioned in my original mail, puppetd does not do anything other than allocate memory. The only system calls it does are rt_sigprocmask and brk(). Unless puppet loaded the whole thing into memory before getting stuck (which I did not consider as a possibility), I don't think the directory contents should make a difference.

Aggelos
Angelos Oikonomopoulos
2010-Oct-06 13:37 UTC
Re: [Puppet Users] puppetd memory consumption
On 10/06/2010 03:24 PM, Felix Frank wrote:
>> In any case, by /more/ selective commenting I've deduced that it's the
>> recurse => true in http://paste.lisp.org/display/115229 that triggers
>> this. If I leave it out (or change it to false), puppet applies the
>> catalog as expected.
>
> Haven't I hinted at you to refactor that in IRC? ;-)

First get it working, then you can improve on it :)

> Seriously though, make sure this is not related to checksums.
> I've had a puppetd run hours when it tried to compute a completely
> unnecessary checksum for a large catalina.out file in a recurse => true
> directory.

Hrm. Interesting. I suppose it could have loaded the whole tree in memory and be crunching on it, but I would consider that counter-intuitive to say the least.

I fail to see why it would have to calculate checksums to ensure file mode and ownership though.

> Numerous workarounds float around in the issue tracker, but I have found
> none to work in 0.25.5. I just evade recurse these days and use exec {
> "chmod" ... } instead.

Hmm, OK, I'll keep searching for workarounds then.

Thanks,
Aggelos
>> Seriously though, make sure this is not related to checksums.
>> I've had a puppetd run hours when it tried to compute a completely
>> unnecessary checksum for a large catalina.out file in a recurse => true
>> directory.
>
> Hrm. Interesting. I suppose it could have loaded the whole tree in
> memory and be crunching on it, but I would consider that
> counter-intuitive to say the least.
>
> I fail to see why it would have to calculate checksums to ensure file
> mode and ownership though.

It's considered a bug, I think.

>> Numerous workarounds float around in the issue tracker, but I have found
>> none to work in 0.25.5. I just evade recurse these days and use exec {
>> "chmod" ... } instead.
>
> Hmm, OK, I'll keep searching for workarounds then.

As I said, I've had no luck with those.

Furthermore, regarding the analysis in your previous mail, I doubt you're seeing this problem. The way I caught puppet red-handed and checksumming away *was* by watching strace, and what I saw were massive reads of the file being checksummed. So you're onto something else, I guess.

If the recurse is giving you trouble, steer clear of it for the time being. That would be the simplest workaround.

Cheers,
Felix
Try setting the checksum to undef (from what I understood this was implemented in 2.6).

How many files are in those folders? What order of magnitude? Tens of thousands, millions, under 1000?

Sidenote: The requires for parent folders are done automagically, you don't need to specify them.

Silviu

On 06.10.2010 12:00, Angelos Oikonomopoulos wrote:
> [...]
> It turns out that if I comment out the
> include of this class: http://paste.lisp.org/display/115229 in my node
> definition the problem goes away.
> [...]
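Both suggestions above can be sketched in one short manifest. The path, owner and mode are placeholders. The checksum value is an assumption as well: the message says "undef", while the value I would expect the 2.6 file type to accept for disabling checksums is none, so check the type reference for the installed version before relying on it.

```puppet
# Hypothetical sketch: path, owner and mode are placeholders.
file { '/srv/foo':
  ensure   => directory,
  recurse  => true,
  owner    => 'www-sync',
  mode     => '0644',
  # Assumption: 'none' is the value that disables checksumming.
  checksum => none,
}

# No explicit require on File['/srv/foo'] is written here; a file
# resource autorequires its parent directory when that directory is
# also managed in the catalog.
file { '/srv/foo/README':
  ensure => present,
}
```

Whether disabling checksums helps in this particular case is unclear, since the strace output reported earlier in the thread shows no file reads at all.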
Yeah, that's too much for recurse.

The issue is that Puppet will create an in-memory 'file' object for each file and directory in the entire tree. It will then try to munge that whole mess into the in-memory catalog.

The result is well...memory FAIL.

In theory, some of this has been addressed in 2.6, but I haven't tested that portion yet. There was a bit of discussion in the dev mailing list on how to handle this inside Puppet.

For now, exec.

Trevor

On 10/6/10, Angelos Oikonomopoulos <angelos.oikonomopoulos@fp-commerce.de> wrote:
> 18k+ files in 5k+ directories, adding up to 250MB of disk usage.
> [...]
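One way to follow the "for now, exec" advice while still letting Puppet manage the top-level directory is sketched below. The path and resource titles are placeholders, and the user name is taken from the node definition quoted earlier in the thread; this is an illustration under those assumptions, not the configuration anyone in the thread actually ran.

```puppet
# Hypothetical sketch: path and titles are placeholders.
# The top-level directory remains an ordinary, non-recursive resource.
file { '/srv/foo':
  ensure => directory,
}

# A single chown process walks the tree, so Puppet does not build one
# in-memory resource per file and directory.
# No guard is set, so this runs on every catalog application.
exec { 'chown-foo-tree':
  command => '/bin/chown -R www-sync /srv/foo',
  require => [ File['/srv/foo'], User['www-sync'] ],
}
```

The trade-off is that Puppet no longer reports which individual files changed; it only records that the command ran.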