Howdy:

Does anybody else see spikes in storeconfigs activity *after* you've been
up and running with storeconfigs for a while?  Twice in the past month our
puppetmaster has been slammed by storeconfigs activity.  We're running
25b2 but not (yet) puppetqd.

Our MySQL questions, com_select and com_insert stats spike first.
com_select and com_update are normally around 5 and spike to 40; questions
is normally around 150 and spikes to 600.  Threads connected goes from
around 15 to 30.  After that it looks like everything queues up behind
MySQL and we start getting timeouts on our ~450 clients.

The storm lasts less than an hour.

Of course nothing special is going on with our clients (that we know of!)
when the storm hits.  I *think* but am not positive that our first storm
happened while we were running mod_proxy + Mongrel.  Our second happened
with Passenger.
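P.S.  For anyone who wants to watch the same counters, something along
these lines does the job (the user name and the 10-second interval are
just placeholders):

    # -r prints the change since the previous sample rather than absolute
    # values; -i 10 takes a sample every 10 seconds
    mysqladmin -u monitor -p -r -i 10 extended-status \
      | egrep 'Questions|Com_select|Com_insert|Com_update|Threads_connected'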
On Tue, 2009-07-28 at 08:50 -0400, Mark Plaksin wrote:
> Howdy:
>
> Does anybody else see spikes in storeconfigs activity *after* you've
> been up and running with storeconfigs for a while?  Twice in the past
> month our puppetmaster has been slammed by storeconfigs activity.
> We're running 25b2 but not (yet) puppetqd.
>
> Our MySQL questions, com_select and com_insert stats spike first.
> com_select and com_update are normally around 5 and spike to 40;
> questions is normally around 150 and spikes to 600.  Threads connected
> goes from around 15 to 30.  After that it looks like everything queues
> up behind MySQL and we start getting timeouts on our ~450 clients.
>
> The storm lasts less than an hour.
>
> Of course nothing special is going on with our clients (that we know
> of!) when the storm hits.  I *think* but am not positive that our first
> storm happened while we were running mod_proxy + Mongrel.  Our second
> happened with Passenger.

The only reason for a storeconfigs storm is that Puppet deletes all the
resources/tags belonging to a particular host and then recreates them, so
you see a lot of INSERTs.

Now the real question is why Puppet thinks there are such discrepancies
between the database and the live compilation.

Are you sure you're not removing hosts from the database?

What would be interesting is to activate the MySQL general query log
(warning: it will increase your load) and dig through the large log around
the timeframe where you see the storm (you can also activate the Rails log
for the same effect).

Or, I remember reading that maatkit now contains a query log extractor for
tcpdump capture files; it would be worth capturing the traffic between
Puppet and MySQL and analyzing the queries performed.  Maybe you'll find
the issue.
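From memory the invocation is something like the following; I haven't
tested this here, and the interface and file names are just placeholders:

    # capture the MySQL traffic (needs root), then summarise the queries
    # seen on the wire
    tcpdump -i eth0 port 3306 -s 65535 -x -nn -q -tttt > mysql-traffic.txt
    mk-query-digest --type tcpdump mysql-traffic.txt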
Good luck :-)
--
Brice Figureau
My Blog: http://www.masterzen.fr/


Brice Figureau <brice-puppet@daysofwonder.com> writes:

>> Our MySQL questions, com_select and com_insert stats spike first.
>> com_select and com_update are normally around 5 and spike to 40;
>> questions is normally around 150 and spikes to 600.  Threads connected
>> goes from around 15 to 30.  After that it looks like everything queues
>> up behind MySQL and we start getting timeouts on our ~450 clients.

...

> The only reason for a storeconfigs storm is that Puppet deletes all the
> resources/tags belonging to a particular host and then recreates them,
> so you see a lot of INSERTs.
>
> Now the real question is why Puppet thinks there are such discrepancies
> between the database and the live compilation.
>
> Are you sure you're not removing hosts from the database?

Yes.  Hosts that no longer exist are still in the database :)

> What would be interesting is to activate the MySQL general query log
> (warning: it will increase your load) and dig through the large log
> around the timeframe where you see the storm (you can also activate the
> Rails log for the same effect).

I meant to ask whether some MySQL expert could look at our binary logs and
figure out what happened :)  Oh, I see there's a mysqlbinlog command!  Who
knew?  Some quick greps of its output say the total number of updates and
inserts from yesterday is about the same as any other day.  Same for
various hours yesterday--the hour that we got slammed doesn't seem to have
more updates or inserts than other hours when we didn't get slammed.

> Or, I remember reading that maatkit now contains a query log extractor
> for tcpdump capture files; it would be worth capturing the traffic
> between Puppet and MySQL and analyzing the queries performed.  Maybe
> you'll find the issue.

Maybe there's a tool which reads binary logs and tells you what caused the
storm :)

> Good luck :-)

Heh, thanks :)
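P.S.  In case anyone wants to do the same sort of counting, something like
this per time window should work (the binlog name and the datetime window
are placeholders):

    # count INSERT/UPDATE statements recorded in the binlog for one hour
    mysqlbinlog --start-datetime="2009-07-27 14:00:00" \
                --stop-datetime="2009-07-27 15:00:00" \
                /var/lib/mysql/mysql-bin.000042 \
      | egrep -ci '^(insert|update)'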
On Tue, 2009-07-28 at 21:09 -0400, Mark Plaksin wrote:
> Brice Figureau <brice-puppet@daysofwonder.com> writes:
>
> >> Our MySQL questions, com_select and com_insert stats spike first.
> >> com_select and com_update are normally around 5 and spike to 40;
> >> questions is normally around 150 and spikes to 600.  Threads connected
> >> goes from around 15 to 30.  After that it looks like everything queues
> >> up behind MySQL and we start getting timeouts on our ~450 clients.
>
> ...
>
> > The only reason for a storeconfigs storm is that Puppet deletes all the
> > resources/tags belonging to a particular host and then recreates them,
> > so you see a lot of INSERTs.
> >
> > Now the real question is why Puppet thinks there are such discrepancies
> > between the database and the live compilation.
> >
> > Are you sure you're not removing hosts from the database?
>
> Yes.  Hosts that no longer exist are still in the database :)

One thing I noticed is the following: I had the pattern below in two
separate places (modules):

  if ! defined(File["a"]) {
    file { "a":
      ...
    }
  }

When a host came to the puppetmaster for its config, it could get File[a]
defined from place1.  If it was connected to another master process, the
same host could get File[a] from place2.  That isn't an issue per se, but
it means this resource can end up with two different tag sets (one
mentioning place1, the other place2), so each time the tags for the
resource were deleted and recreated, generating database load.

Maybe you have such a pattern in your manifests?

> > What would be interesting is to activate the MySQL general query log
> > (warning: it will increase your load) and dig through the large log
> > around the timeframe where you see the storm (you can also activate
> > the Rails log for the same effect).
>
> I meant to ask whether some MySQL expert could look at our binary logs
> and figure out what happened :)  Oh, I see there's a mysqlbinlog
> command!  Who knew?  Some quick greps of its output say the total number
> of updates and inserts from yesterday is about the same as any other
> day.  Same for various hours yesterday--the hour that we got slammed
> doesn't seem to have more updates or inserts than other hours when we
> didn't get slammed.

The binlog contains only write queries (i.e. INSERT, UPDATE, DELETE), so
you're only seeing part of the story.  It's easy to trigger a "storm" if
you have a few queries that take a long time.

But maybe the cause is external to MySQL.  I have seen this kind of thing
happen when:
 * I/O degrades because another process is eating all the available disk
   throughput (usually backup processes or snapshotting)
 * I/O degrades because the machine is swapping (something eating memory),
   either because the swap area is on the same disks as the MySQL data, or
   simply because the MySQL InnoDB buffer pool is being swapped in and out.

What I suggest, if it happens next time, is to use innotop and have a look
at the live running queries.  You might find that you have one or more
"slow" queries.
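A minimal way to start it straight in the query-list view is roughly this
(user and host are placeholders; give it credentials however you normally
do):

    # --mode Q jumps directly to the Query List view of running queries
    innotop -u monitor -h localhost --mode Q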
> > Or, I remember reading that maatkit now contains a query log extractor
> > for tcpdump capture files; it would be worth capturing the traffic
> > between Puppet and MySQL and analyzing the queries performed.  Maybe
> > you'll find the issue.
>
> Maybe there's a tool which reads binary logs and tells you what caused
> the storm :)

If only :-)
--
Brice Figureau
My Blog: http://www.masterzen.fr/
Brice Figureau wrote:
> What I suggest, if it happens next time, is to use innotop and have a
> look at the live running queries.  You might find that you have one or
> more "slow" queries.

The slow query log might be a better alternative.

--
Trevor Hemsley
Infrastructure Engineer, Calypso, Brighton, UK
On Wed, 2009-07-29 at 09:56 +0100, Trevor Hemsley wrote:
> Brice Figureau wrote:
> > What I suggest, if it happens next time, is to use innotop and have a
> > look at the live running queries.  You might find that you have one
> > or more "slow" queries.
>
> The slow query log might be a better alternative.

Of course, but it needs to be activated, and to activate it you must
restart MySQL, which might not be an option; hence my suggestion of
viewing the live queries with innotop (did I say this tool is very good?).
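For the record, on the 5.0 series that means roughly the following in
my.cnf and then a restart; the log path and the 2-second threshold are
only examples:

    [mysqld]
    # log any statement that takes longer than long_query_time seconds
    log_slow_queries = /var/log/mysql/mysql-slow.log
    long_query_time  = 2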
--
Brice Figureau
My Blog: http://www.masterzen.fr/


Brice Figureau <brice-puppet@daysofwonder.com> writes:

> One thing I noticed is the following: I had the pattern below in two
> separate places (modules):
>
>   if ! defined(File["a"]) {
>     file { "a":
>       ...
>     }
>   }
>
> When a host came to the puppetmaster for its config, it could get
> File[a] defined from place1.  If it was connected to another master
> process, the same host could get File[a] from place2.  That isn't an
> issue per se, but it means this resource can end up with two different
> tag sets (one mentioning place1, the other place2), so each time the
> tags for the resource were deleted and recreated, generating database
> load.
>
> Maybe you have such a pattern in your manifests?

I don't think we have this.  If we did, wouldn't it be generating the
extra load all the time instead of twice a month?  (Our sample size is
still just 2, so "twice a month" might not be quite right.)

> But maybe the cause is external to MySQL.  I have seen this kind of
> thing happen when:
>  * I/O degrades because another process is eating all the available disk
>    throughput (usually backup processes or snapshotting)
>  * I/O degrades because the machine is swapping (something eating
>    memory), either because the swap area is on the same disks as the
>    MySQL data, or simply because the MySQL InnoDB buffer pool is being
>    swapped in and out.

This box is dedicated to puppetmaster and MySQL and never has iowait, so I
don't think this is it.

> What I suggest, if it happens next time, is to use innotop and have a
> look at the live running queries.  You might find that you have one or
> more "slow" queries.

I installed innotop.  Of course it says everything's bored right now :)
I assume that when there's a slow query it sticks around on the screen?
Everything's so fast now that it disappears before I can read the SQL.

I also installed maatkit.  Hopefully we'll learn something during the next
storm!

Thanks for the help!