Martin Willemsma
2012-Jun-07  06:11 UTC
[Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
Hi,

I deployed MCollective to our Puppet clients, approximately 200 nodes. Our platform requires the most secure setup possible, so PSK as the security provider is not an option. I therefore changed the security provider to aes_security, reusing Puppet's certificates in server.cfg as described in the docs (1).

Our goal is to use MCollective to offload event-driven actions from a web application to agents running on designated nodes. For example: send a message to the 'platform' collective to create a DNS record. This message should be processed by a node that runs the 'DNS' agent.

One thing I noticed after switching to the aes_security plugin is that the ping latency went up and replies to an action no longer come back from all the nodes. Where does this latency come from?

If I run 'mco ping' on the client I expect it to:

- get a response from every node
- show me the ---- ping statistics ---- at the end
- return to my console, ready for the next command

But it does not. Instead it shows me the output for 207 nodes and then it just hangs there.

This output shows the ping times (hostnames omitted):

1340.38 ms <- first reply
1406.25 ms
1456.71 ms
1508.19 ms
1550.52 ms
1576.07 ms
1601.15 ms
1627.40 ms
1653.23 ms
1678.26 ms
[ .. omitted intentionally ]
7518.66 ms
7556.47 ms
7593.06 ms
7623.46 ms
7648.64 ms
7685.62 ms
7722.84 ms <- last reply I see on the client console

If I check the log file on the client sending the command, '/var/log/mcollective.log', the last few lines show:

D, [2012-06-07T07:39:46.470905 #15910] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Aes_security
D, [2012-06-07T07:39:46.471029 #15910] DEBUG -- : aes_security.rb:202:in `deserialize' De-Serializing using marshal
D, [2012-06-07T07:39:46.471121 #15910] DEBUG -- : aes_security.rb:255:in `decrypt' Decrypting message using private key
D, [2012-06-07T07:39:46.495265 #15910] DEBUG -- : aes_security.rb:202:in `deserialize' De-Serializing using marshal
D, [2012-06-07T07:39:46.495711 #15910] DEBUG -- : stomp.rb:191:in `receive' Waiting for a message from Stomp

I can wait forever, but it does not receive anything more. I have to interrupt it (Control + Break) to get out:

^C

---- ping statistics ----
207 replies max: 6877.20 min: 616.98 avg: 3912.99

The log file then shows:

D, [2012-06-07T07:41:10.571316 #15910] DEBUG -- : client.rb:72:in `unsubscribe' Unsubscribing reply target for discovery
D, [2012-06-07T07:41:10.571496 #15910] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp
D, [2012-06-07T07:41:10.571615 #15910] DEBUG -- : stomp.rb:257:in `unsubscribe' Unsubscribing from /topic/mcollective.discovery.reply
D, [2012-06-07T07:41:10.572767 #15910] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin connector_plugin with class MCollective::Connector::Stomp
D, [2012-06-07T07:41:10.572849 #15910] DEBUG -- : stomp.rb:264:in `disconnect' Disconnecting from Stomp

I see the same behaviour with any of the other commands: 'get_fact', 'rpc package', 'rpc service'. I am simply not able to do a search over the collective when using the AES plugin.

If I switch back to PSK, replies are speedy and always come back. But then again, this is not what we want.

At first I was using RabbitMQ with its default config. I tried some tweaking, but it did not seem to make any difference to the behaviour of mco. I then switched to ActiveMQ 5.6 with the config files from puppetlabs.git and set it up according to the docs. Again I played with some settings, and it made no difference at all.

tcpdump shows that the node running the MCollective server responds to the message sent from the MCollective client, but the output is only printed on the client seconds after the node replies. Somehow it looks like the message gets stuck in the message bus and arrives late on the client.

Any hints on where to tackle this issue are more than welcome and really appreciated. This issue is blocking the implementation of MCollective on our platform, which is more than just sad.

Currently I'm using MCollective 2.0.0 on Ubuntu 10.04 LTS x86_64.

(1) http://docs.puppetlabs.com/mcollective/reference/plugins/security_aes.html

---
Best regards,

Martin

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
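A rough check on the ping output in the first post: the reply times climb in fairly even steps, which would be consistent with replies being handled one after another on the client. That is an interpretation, not something the thread confirms. The Python sketch below only does arithmetic on the numbers quoted in the post. It assumes that the first and last listed lines belong to the same 207-reply run, which the post does not state explicitly; the function names are made up for illustration.

```python
# Reply times in milliseconds, copied from the 'mco ping' output in the post.
first_replies = [1340.38, 1406.25, 1456.71, 1508.19, 1550.52,
                 1576.07, 1601.15, 1627.40, 1653.23, 1678.26]
last_replies = [7518.66, 7556.47, 7593.06, 7623.46,
                7648.64, 7685.62, 7722.84]
reply_count = 207  # number of replies reported in the post


def gaps(times):
    """Spacing between consecutive replies, in ms, rounded to 2 decimals."""
    return [round(later - earlier, 2) for earlier, later in zip(times, times[1:])]


def mean_spacing(first_ms, last_ms, count):
    """Average spacing if `count` replies span first_ms..last_ms (assumption)."""
    return (last_ms - first_ms) / (count - 1)


print("gaps at the start:", gaps(first_replies))
print("gaps at the end:  ", gaps(last_replies))
print("mean spacing (ms):",
      round(mean_spacing(first_replies[0], last_replies[-1], reply_count), 2))
```

With the quoted numbers the average spacing comes out at about 31 ms per reply, so the long tail of the ping output is mostly the sum of many small per-reply delays rather than one large one.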
R.I.Pienaar
2012-Jun-07  08:40 UTC
Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
----- Original Message -----
> From: "Martin Willemsma" <mwillemsma@gmail.com>
> To: "Puppet Users" <puppet-users@googlegroups.com>
> Sent: Thursday, June 7, 2012 7:11:41 AM
> Subject: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
>
> I deployed MCollective to our Puppet clients. approx. ~ 200. Our
> platform requires the most secure setup possible, so PSK as
> securityprovider is not an option.

I'd almost always suggest SSL/TLS + the ssl plugin now.

> [...]
> but it does not. Instead it shows me the output for 207 nodes and
> then it just "HANGS" there.
>
> This output shows pingtimes hostnames omitted
>
> 1340.38 ms <- first reply
> 1406.25 ms
> [...]
> 7685.62 ms
> 7722.84 ms <- last reply I see on the client console

There are a few odd things here. First, the first reply is way too slow. The AES plugin is computationally very heavy and not suited to large deployments. Yours is not large, though, and even then the overhead is in the region of 30 to 40 ms over that of the PSK plugin on the first response. The effect snowballs, but this is not the performance I would expect.

Second, 'mco ping' should not run indefinitely until you stop it. It should run for 5 seconds and then end. Does yours do that with the PSK plugin active?

It is hard to guess what the underlying cause of this combination of issues might be. It could be a very slow machine acting as the mco client, or it could be issues on the network; perhaps there are a lot of TCP retransmissions or something along those lines.

On the machines that do not respond, do you see anything in their logs? Put them in debug and make sure they got the request and replied. Anything weird on your broker? Large CPU usage perhaps?
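The client debug log quoted in the original post carries microsecond timestamps, so the time between consecutive entries can be measured directly. A minimal Python sketch follows. It assumes the Ruby Logger line format shown in the post, and it assumes that the gap after an entry is time spent in the step that entry announces; neither is confirmed in the thread. `step_durations` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Debug lines copied from the client log quoted in the original post.
LOG = """\
D, [2012-06-07T07:39:46.470905 #15910] DEBUG -- : pluginmanager.rb:83:in `[]' Returning cached plugin security_plugin with class MCollective::Security::Aes_security
D, [2012-06-07T07:39:46.471029 #15910] DEBUG -- : aes_security.rb:202:in `deserialize' De-Serializing using marshal
D, [2012-06-07T07:39:46.471121 #15910] DEBUG -- : aes_security.rb:255:in `decrypt' Decrypting message using private key
D, [2012-06-07T07:39:46.495265 #15910] DEBUG -- : aes_security.rb:202:in `deserialize' De-Serializing using marshal
D, [2012-06-07T07:39:46.495711 #15910] DEBUG -- : stomp.rb:191:in `receive' Waiting for a message from Stomp
"""

# Captures the timestamp and the message text of one debug line.
STAMP = re.compile(
    r"\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{6}) #\d+\] DEBUG -- : (.*)"
)


def step_durations(log_text):
    """Return (milliseconds until the next entry, message) for each entry but the last."""
    entries = []
    for line in log_text.splitlines():
        match = STAMP.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            entries.append((stamp, match.group(2)))
    return [
        ((nxt[0] - cur[0]).total_seconds() * 1000.0, cur[1])
        for cur, nxt in zip(entries, entries[1:])
    ]


for ms, message in step_durations(LOG):
    print(f"{ms:8.3f} ms  {message}")
```

For the five quoted lines the largest gap, about 24 ms, follows the "Decrypting message using private key" entry. Running the same measurement over the debug logs of a few servers would show whether they spend a comparable time per message.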
Martin Willemsma
2012-Jun-08  06:11 UTC
Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
Thanks for your response.

One thing I noticed is that with PSK I do indeed see the discovery with the progress bar. That is something I never see when using AES. Commands always come back from discovered nodes when using PSK.

You suggest SSL/TLS. Is that the same as the AES provider I'm using right now?

I run the client on the same node as RabbitMQ. I also tried an ActiveMQ installation on another node in the same subnet. It does not seem to make any difference. The node running the message bus is mostly idle: 4 CPUs / 4 GB RAM, and on the other node 2 CPUs / 2 GB RAM. I also tried running the client on my workstation (i5 / 8 GB RAM / SSD disk), with the same behaviour.

I agree that the ping times are pretty high, but I could live with that if at least all the replies came back.

I have spent quite some time making this work on our platform. I need to look at the network side in more depth.

---
Kind regards,

Martin Willemsma
R.I.Pienaar
2012-Jun-08  09:05 UTC
Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
----- Original Message -----
> From: "Martin Willemsma" <mwillemsma@gmail.com>
> To: puppet-users@googlegroups.com
> Sent: Friday, June 8, 2012 7:11:39 AM
> Subject: Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
>
> You suggest SSL TLS, is that the same as AES provider i'm using right
> now?

With it, the identity of the client is securely established and the payload is encrypted using industry standards. I guess it depends on your needs, though.

> [...]
> I agree that the ping times are pretty high but I could live with
> that if at least all the replies came back.

Ping times that long will just prevent everything from working. There's a fundamental problem somewhere.
Martin Willemsma
2012-Jun-09  03:37 UTC
Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
I currently have two different message brokers running: ActiveMQ on one node and RabbitMQ on a different one. If I switch my MCollective servers from one broker to the other and run 'mco ping' while the nodes are registering with the other broker, I hit a threshold somewhere.

Scenario: I start 'mco ping' from 2 clients simultaneously, each client registered to a different broker. Both can successfully ping all nodes.

client 1 -

199.49 ms
238.21 ms
275.27 ms
312.39 ms
350.79 ms
387.84 ms
[...]
2624.43 ms
2993.36 ms

---- ping statistics ----
50 replies max: 2993.36 min: 199.49 avg: 1030.95

client 2 -

469.72 ms
492.41 ms
516.31 ms
541.34 ms
638.03 ms
664.93 ms
[...]
4254.50 ms
4276.60 ms

---- ping statistics ----
160 replies max: 4276.60 min: 469.72 avg: 2414.46

The command does complete, and the ping times start much lower.

I found some nodes with less than 512 MB of RAM. Could these be causing the problem because they are swapping and do not respond quickly enough, so that somehow a reply gets lost in translation and breaks the mechanism?

If I run 'mco rpc service status service=ssh' on both clients:

client 1 -

Determining the amount of hosts matching filter for 2 seconds .... 50

 * [ ============================================================> ] 50 / 50

client 2 -

Determining the amount of hosts matching filter for 2 seconds ....
warn: Could not decrypt message from client: #<Class:0x7f16ac7ba268>: execution expired
warn: Ignoring a message that did not pass security validations

I'm adjusting my manifest to ensure the mcollective service is not running on nodes with less than 512 MB of RAM.
pluginmanager.rb:83:in `[]'' Returning cached plugin >> >> connector_plugin >> >> with class MCollective::Connector::Stomp >> >> D, [2012-06-07T07:41:10.571615 #15910] DEBUG -- : stomp.rb:257:in >> >> `unsubscribe'' Unsubscribing from >> >> /topic/mcollective.discovery.reply >> >> D, [2012-06-07T07:41:10.572767 #15910] DEBUG -- : >> >> pluginmanager.rb:83:in `[]'' Returning cached plugin >> >> connector_plugin >> >> with class MCollective::Connector::Stomp >> >> D, [2012-06-07T07:41:10.572849 #15910] DEBUG -- : stomp.rb:264:in >> >> `disconnect'' Disconnecting from Stomp >> >> >> >> Same behavior with using any of the other commands ''get_fact'' , >> >> ''rpc >> >> package'' ''rpc service''. I''m just not able to do a search over the >> >> collective when using the AES plugin. >> >> >> >> If I switch switch back to PSK replies are speedy and always come >> >> back. But then again this is not want. >> >> >> >> At first I was using RabbitMQ default config. I tries some >> >> tweaking >> >> but did not seem to make any difference to the behaviour of mco. I >> >> switched to ActiveMQ 5.6 with the configfiles from puppetlabs.git. >> >> Set >> >> it up according to the docs , again played with some setttings and >> >> did >> >> not do anything at all. >> >> >> >> tcpdumps show the node running the mcollective server responds to >> >> the >> >> message send from the mcollective client. But seconds after the >> >> node >> >> replies the output gets printed on the client. Somehow it looks >> >> like >> >> the message gets ''STUCK'' in the messagebus and arrives late on the >> >> client. >> >> >> >> Any hints on were to tackle this issue are more then welcome and >> >> really appreciated . This issue is blocking the implementation of >> >> mcollective on our platform which is more than just sad >> >> >> >> Currently I''m using MCollective 2.0.0 on Ubuntu 10.04 LTS X86_64. 
>> >> >> >> (1) >> >> http://docs.puppetlabs.com/mcollective/reference/plugins/security_aes.html >> >> >> >> --- >> >> Best regards, >> >> >> >> Martin >> >> >> >> -- >> >> You received this message because you are subscribed to the Google >> >> Groups "Puppet Users" group. >> >> To post to this group, send email to >> >> puppet-users@googlegroups.com. >> >> To unsubscribe from this group, send email to >> >> puppet-users+unsubscribe@googlegroups.com. >> >> For more options, visit this group at >> >> http://groups.google.com/group/puppet-users?hl=en. >> >> >> >> >> > >> > -- >> > You received this message because you are subscribed to the Google >> > Groups "Puppet Users" group. >> > To post to this group, send email to puppet-users@googlegroups.com. >> > To unsubscribe from this group, send email to >> > puppet-users+unsubscribe@googlegroups.com. >> > For more options, visit this group at >> > http://groups.google.com/group/puppet-users?hl=en. >> > >> >> >> >> -- >> --- >> Met vriendelijke groet, >> >> Martin Willemsma >> >> -- >> You received this message because you are subscribed to the Google >> Groups "Puppet Users" group. >> To post to this group, send email to puppet-users@googlegroups.com. >> To unsubscribe from this group, send email to >> puppet-users+unsubscribe@googlegroups.com. >> For more options, visit this group at >> http://groups.google.com/group/puppet-users?hl=en. >> >> > > -- > You received this message because you are subscribed to the Google Groups "Puppet Users" group. > To post to this group, send email to puppet-users@googlegroups.com. > To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. > For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en. >-- --- Met vriendelijke groet, Martin Willemsma -- You received this message because you are subscribed to the Google Groups "Puppet Users" group. To post to this group, send email to puppet-users@googlegroups.com. 
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
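To compare runs like the two above without eyeballing the lists, the reply times can be summarized with a few lines of Python. This is only a sketch: it assumes the output has been saved as text and that every reply contains a value followed by "ms", as in the listings in this thread. The function name `summarize_ping_times` is made up for illustration and is not part of MCollective.

```python
import re

# Matches values such as "199.49 ms"; the exact 'mco ping' line layout is an
# assumption based on the output pasted in this thread.
PING_VALUE = re.compile(r"(\d+(?:\.\d+)?)\s*ms")


def summarize_ping_times(output: str) -> dict:
    """Collect every '<number> ms' value in the text and summarize it."""
    times = [float(value) for value in PING_VALUE.findall(output)]
    if not times:
        return {"replies": 0}
    return {
        "replies": len(times),
        "min": min(times),
        "max": max(times),
        "avg": sum(times) / len(times),
    }


if __name__ == "__main__":
    sample = "199.49 ms\n238.21 ms\n275.27 ms\n"
    print(summarize_ping_times(sample))
```

Running it over the saved output of each client gives the same reply count, minimum, maximum and average that the "ping statistics" footer reports, which makes it easier to track whether a configuration change moved the numbers.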
R.I.Pienaar
2012-Jun-09  08:47 UTC
Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
----- Original Message -----
> From: "Martin Willemsma" <mwillemsma@gmail.com>
> To: puppet-users@googlegroups.com
> Sent: Saturday, June 9, 2012 4:37:05 AM
> Subject: Re: [Puppet Users] MCollective not all nodes answer to commands when using aes_security plugin
>
> I currently have two different MQ's running, one activeMQ running on
> a node and a rabbitMQ on a different one.
> If I switch my mcollective servers from one broker to another and do
> a 'mco ping' while the nodes are registering with the other broker I
> hit a threshold somewhere.

Are you saying you're trying to mix ActiveMQ and RabbitMQ in the same setup?

> Scenario:
>
> I start a 'mco ping' from 2 clients simultaneously. Each client
> registered to a different broker. Both can successfully do a ping
> to all nodes.
>
> client 1 -
>
> 199.49 ms
> 238.21 ms
> 275.27 ms
> 312.39 ms
> 350.79 ms
> 387.84 ms
> [...]
> 2624.43 ms
> 2993.36 ms
>
> ---- ping statistics ----
> 50 replies max: 2993.36 min: 199.49 avg: 1030.95
>
> client 2 -
>
> 469.72 ms
> 492.41 ms
> 516.31 ms
> 541.34 ms
> 638.03 ms
> 664.93 ms
> [...]
> 4254.50 ms
> 4276.60 ms
>
> ---- ping statistics ----
> 160 replies max: 4276.60 min: 469.72 avg: 2414.46
>
> The command does complete and ping times start way lower. I found
> some nodes with <512 Mb ram. Could it be possible that these are causing
> problem because they are swapping and do not respond quick enough and
> somehow that reply gets lost in translation causing the mechanism to
> break?

They won't cause everything to slow down, unless they are either your client or your broker. Those slow machines will just respond slowly.

> If I do a 'mco rpc service status service=ssh' on both clients
>
> client 1 -
>
> Determining the amount of hosts matching filter for 2 seconds .... 50
>
> * [ ============================================================> ]
> 50 / 50
>
> client 2 -
>
> Determining the amount of hosts matching filter for 2 seconds ....
> warn: Could not decrypt message from client: #<Class:0x7f16ac7ba268>:
> execution expired
> warn: Ignoring a message that did not pass security validations

This might be related to not having enough entropy on the machine running the client, perhaps? Is this still with AES or ...?
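One way to check the entropy suggestion on a Linux client is to read the kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail. The sketch below assumes a Linux machine; the threshold of 200 bits is an arbitrary illustrative value, not a figure from MCollective or from this thread, and the helper names are hypothetical.

```python
from pathlib import Path

# Linux exposes the kernel's current entropy estimate (in bits) in this file.
ENTROPY_PATH = Path("/proc/sys/kernel/random/entropy_avail")


def parse_entropy(text: str) -> int:
    """Parse the contents of entropy_avail, e.g. '153\\n' -> 153."""
    return int(text.strip())


def entropy_is_low(available: int, threshold: int = 200) -> bool:
    """Return True when the estimate is below the (assumed) threshold."""
    return available < threshold


if __name__ == "__main__":
    if ENTROPY_PATH.exists():
        bits = parse_entropy(ENTROPY_PATH.read_text())
        state = "low" if entropy_is_low(bits) else "ok"
        print(f"entropy_avail: {bits} bits ({state})")
    else:
        print("entropy_avail not found; this check only applies to Linux")
```

Sampling this value on the client while a command is stuck on "Determining the amount of hosts" would show whether the pool is being drained at the moment the "execution expired" warnings appear.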