PBWebGuy
2010-Nov-10 16:40 UTC
[Puppet Users] Could not retrieve catalog from remote server - random
I have one of 18+ servers in an environment that just started having a problem when attempting to do an update. On the node I enter the command 'puppetd -t --debug --trace'. I don't see anything obvious, but the error is: "Could not retrieve catalog from remote server"

I set the puppetmaster into debug mode with the command 'puppetmasterd --no-daemonize --debug -v'. When the node attempts to update there is no output from the PM. I therefore set up tcpdump to watch the traffic, and there is in fact traffic, as shown below. Therefore I know that the two hosts are communicating.

On occasion it will update, but when it does so is completely random.

Any suggestions?

Thanks,

John

tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
11:31:02.253921 IP (tos 0x0, ttl 64, id 59181, offset 0, flags [DF], proto: TCP (6), length: 60) devcas1.domain.local.41082 > util3.domain.local.8140: S, cksum 0xb458 (correct), 297236183:297236183(0) win 5840 <mss 1380,sackOK,timestamp 666800634 0,nop,wscale 7>
11:31:02.254422 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 60) util3.domain.local.8140 > devcas1.domain.local.41082: S, cksum 0x7cda (correct), 1623860666:1623860666(0) ack 297236184 win 5792 <mss 1460,sackOK,timestamp 3608730031 666800634,nop,wscale 7>
11:31:02.255511 IP (tos 0x0, ttl 64, id 59182, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.41082 > util3.domain.local.8140: ., cksum 0xc215 (correct), ack 1 win 46 <nop,nop,timestamp 666800637 3608730031>
11:31:02.357588 IP (tos 0x0, ttl 64, id 64523, offset 0, flags [DF], proto: TCP (6), length: 52) util3.domain.local.8140 > devcas1.domain.local.41082: F, cksum 0xc1ad (correct), 1:1(0) ack 1 win 46 <nop,nop,timestamp 3608730134 666800637>
11:31:02.358933 IP (tos 0x0, ttl 64, id 59183, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.41082 > util3.domain.local.8140: ., cksum 0xc146 (correct), ack 2 win 46 <nop,nop,timestamp 666800740 3608730134>
11:31:02.450472 IP (tos 0x0, ttl 64, id 59184, offset 0, flags [DF], proto: TCP (6), length: 157) devcas1.domain.local.41082 > util3.domain.local.8140: P 1:106(105) ack 2 win 46 <nop,nop,timestamp 666800831 3608730134>
11:31:02.450498 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 40) util3.domain.local.8140 > devcas1.domain.local.41082: R, cksum 0xee79 (correct), 1623860668:1623860668(0) win 0
11:33:11.491647 IP (tos 0x0, ttl 64, id 44612, offset 0, flags [DF], proto: TCP (6), length: 334) devcas1.domain.local.55540 > util3.domain.local.ldaps: P 3058489015:3058489297(282) ack 81338168 win 128 <nop,nop,timestamp 666929878 3608244168>
11:33:11.492744 IP (tos 0x0, ttl 64, id 48557, offset 0, flags [DF], proto: TCP (6), length: 446) util3.domain.local.ldaps > devcas1.domain.local.55540: P 1:395(394) ack 282 win 155 <nop,nop,timestamp 3608859274 666929878>
11:33:11.492818 IP (tos 0x0, ttl 64, id 48558, offset 0, flags [DF], proto: TCP (6), length: 142) util3.domain.local.ldaps > devcas1.domain.local.55540: P 395:485(90) ack 282 win 155 <nop,nop,timestamp 3608859274 666929878>
11:33:11.493585 IP (tos 0x0, ttl 64, id 44613, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.55540 > util3.domain.local.ldaps: ., cksum 0x102d (correct), ack 395 win 142 <nop,nop,timestamp 666929880 3608859274>
11:33:11.493595 IP (tos 0x0, ttl 64, id 44614, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.55540 > util3.domain.local.ldaps: ., cksum 0x0fd3 (correct), ack 485 win 142 <nop,nop,timestamp 666929880 3608859274>

--
You received this message because you are subscribed to the Google Groups "Puppet Users" group.
To post to this group, send email to puppet-users@googlegroups.com.
To unsubscribe from this group, send email to puppet-users+unsubscribe@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-users?hl=en.
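The port-8140 part of the capture above is easier to read as a timeline. What follows is only a minimal Python sketch, not something posted in the thread: the names `CAPTURE`, `LINE_RE` and `summarize` are made up for illustration, and the regular expression assumes the exact tcpdump output style shown in the post. It prints, for each packet, the microseconds elapsed since the first SYN, who sent it, and the TCP flag letter.

```python
import re
from datetime import datetime, timedelta

# The seven port-8140 packets, copied verbatim from the capture in the post.
CAPTURE = """\
11:31:02.253921 IP (tos 0x0, ttl 64, id 59181, offset 0, flags [DF], proto: TCP (6), length: 60) devcas1.domain.local.41082 > util3.domain.local.8140: S, cksum 0xb458 (correct), 297236183:297236183(0) win 5840 <mss 1380,sackOK,timestamp 666800634 0,nop,wscale 7>
11:31:02.254422 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 60) util3.domain.local.8140 > devcas1.domain.local.41082: S, cksum 0x7cda (correct), 1623860666:1623860666(0) ack 297236184 win 5792 <mss 1460,sackOK,timestamp 3608730031 666800634,nop,wscale 7>
11:31:02.255511 IP (tos 0x0, ttl 64, id 59182, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.41082 > util3.domain.local.8140: ., cksum 0xc215 (correct), ack 1 win 46 <nop,nop,timestamp 666800637 3608730031>
11:31:02.357588 IP (tos 0x0, ttl 64, id 64523, offset 0, flags [DF], proto: TCP (6), length: 52) util3.domain.local.8140 > devcas1.domain.local.41082: F, cksum 0xc1ad (correct), 1:1(0) ack 1 win 46 <nop,nop,timestamp 3608730134 666800637>
11:31:02.358933 IP (tos 0x0, ttl 64, id 59183, offset 0, flags [DF], proto: TCP (6), length: 52) devcas1.domain.local.41082 > util3.domain.local.8140: ., cksum 0xc146 (correct), ack 2 win 46 <nop,nop,timestamp 666800740 3608730134>
11:31:02.450472 IP (tos 0x0, ttl 64, id 59184, offset 0, flags [DF], proto: TCP (6), length: 157) devcas1.domain.local.41082 > util3.domain.local.8140: P 1:106(105) ack 2 win 46 <nop,nop,timestamp 666800831 3608730134>
11:31:02.450498 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto: TCP (6), length: 40) util3.domain.local.8140 > devcas1.domain.local.41082: R, cksum 0xee79 (correct), 1623860668:1623860668(0) win 0
"""

# Timestamp, IP header details in parentheses, "src > dst:", then the flag letters.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) IP \(.*?\) (\S+) > (\S+): ([SFPR.]+),?\s"
)


def summarize(capture, master_port="8140"):
    """Return (microseconds since first packet, sender, flag letters) per packet."""
    rows = []
    first = None
    for line in capture.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a packet line
        stamp, src, _dst, flags = match.groups()
        when = datetime.strptime(stamp, "%H:%M:%S.%f")
        if first is None:
            first = when
        elapsed_us = (when - first) // timedelta(microseconds=1)
        # The sender is the master when the source port is the master's port.
        sender = "master" if src.endswith("." + master_port) else "node"
        rows.append((elapsed_us, sender, flags))
    return rows


if __name__ == "__main__":
    for elapsed_us, sender, flags in summarize(CAPTURE):
        print(f"{elapsed_us:>7} us  {sender:<6}  {flags}")
```

Read this way, the master (util3) sends a FIN about 104 ms after the first SYN, before the node has sent any payload. The node's first 105 bytes only go out at about 197 ms and are answered with an RST. The capture alone does not say why the node was that slow or why the master closed that early.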
PBWebGuy
2010-Nov-10 16:41 UTC
[Puppet Users] Re: Could not retrieve catalog from remote server - random
Let me also add that I have several servers that have the same exact "role" in their node definitions and don't have any problem with updates.
PBWebGuy
2010-Nov-10 17:02 UTC
[Puppet Users] Re: Could not retrieve catalog from remote server - random
A few more findings:

1. I removed /var/lib/puppet/clientbucket and ran the update manually; it worked one time. Repeating this does not work.

2. Here is the trace of the puppetd command:

/usr/lib/ruby/1.8/net/http.rb:586:in `connect'
/usr/lib/ruby/1.8/net/http.rb:586:in `connect'
/usr/lib/ruby/1.8/net/http.rb:553:in `do_start'
/usr/lib/ruby/1.8/net/http.rb:542:in `start'
/usr/lib/ruby/1.8/net/http.rb:1035:in `request'
/usr/lib/ruby/1.8/net/http.rb:772:in `get'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/rest.rb:69:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector/indirection.rb:202:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/indirector.rb:51:in `find'
/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:208:in `retrieve_new_catalog'
/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:418:in `thinmark'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
/usr/lib/ruby/1.8/benchmark.rb:307:in `realtime'
/usr/lib/ruby/site_ruby/1.8/puppet/util.rb:417:in `thinmark'
/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:207:in `retrieve_new_catalog'
/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:104:in `retrieve_catalog'
/usr/lib/ruby/site_ruby/1.8/puppet/configurer.rb:142:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/agent/locker.rb:21:in `lock'
/usr/lib/ruby/site_ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/1.8/sync.rb:229:in `synchronize'
/usr/lib/ruby/site_ruby/1.8/puppet/agent.rb:53:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/agent.rb:134:in `with_client'
/usr/lib/ruby/site_ruby/1.8/puppet/agent.rb:51:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/application/puppetd.rb:103:in `onetime'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:226:in `send'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:226:in `run_command'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:217:in `run'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:306:in `exit_on_fail'
/usr/lib/ruby/site_ruby/1.8/puppet/application.rb:217:in `run'
/usr/sbin/puppetd:160

3. I'm using 0.25.5.

Any help or suggestions are much appreciated.

Thanks,

John
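The trace above ends inside Net::HTTP's `connect`, so the failure happens while the connection is being set up, not while a catalog is being compiled. One way to put a number on an intermittent failure like this is to repeat the connection attempt outside of Puppet and count how often each stage fails. The following is a rough Python sketch under several assumptions: `probe_once` and `tally` are hypothetical helper names, certificate verification is deliberately switched off because only the handshake is of interest, and a modern Python TLS stack may refuse to negotiate with a master as old as 0.25.5, in which case only the TCP stage is informative.

```python
import socket
import ssl
import time
from collections import Counter


def probe_once(host, port, timeout=5.0):
    """Attempt a TCP connect followed by a TLS handshake.

    Returns (stage, seconds) where stage is 'tcp-failed', 'tls-failed' or 'ok'.
    """
    context = ssl.create_default_context()
    # Assumption: we only care whether the handshake completes, not whom we talk to.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "tcp-failed", time.monotonic() - start
    try:
        # wrap_socket performs the handshake before returning.
        with context.wrap_socket(sock, server_hostname=host):
            return "ok", time.monotonic() - start
    except OSError:  # ssl.SSLError, resets and timeouts are all OSError subclasses
        return "tls-failed", time.monotonic() - start
    finally:
        sock.close()


def tally(results):
    """Count how many probes ended in each stage."""
    return Counter(stage for stage, _seconds in results)


if __name__ == "__main__":
    # Host name taken from the capture in this thread; 8140 is the port seen there.
    results = [probe_once("util3.domain.local", 8140) for _ in range(20)]
    print(tally(results))
    print("slowest attempt: %.3f s" % max(seconds for _stage, seconds in results))
```

Running the same probe from a healthy node and from the failing one would show whether the problem follows the node or the master.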
Patrick
2010-Nov-10 17:05 UTC
Re: [Puppet Users] Could not retrieve catalog from remote server - random
On Nov 10, 2010, at 8:40 AM, PBWebGuy wrote:

> I set the puppetmaster into debug mode with command 'puppetmasterd --no-daemonize --debug -v'. When the node attempts to update there is no output by the PM.

Make sure that you get output from the puppetmaster for some things. It's easy to look at the wrong log if you are using Passenger.

I've seen a similar error when the puppetmaster is overloaded, so the client times out. For me, this happens when all 50 clients ask for a catalog at the same time (don't ask). Could this be the problem?
PBWebGuy
2010-Nov-10 17:12 UTC
[Puppet Users] Re: Could not retrieve catalog from remote server - random
> Make sure that you get output from the puppetmaster for some things. It's easy to look at the wrong log if you are using passenger.
>
> I've seen a similar error problem when the puppetmaster is overloaded so the client times out. For me, this is when It gets all 50 clients ask for a catalog at the same time (don't ask). Could this be the problem?

Hi Patrick,

I'm running the PM with --no-daemonize, so I see all output. I'm not using Passenger, and the server is only being hit by manual invocations from nodes, so it is not being overtaxed.

It seems completely random, in that it will work one out of 20 tries. I don't have any other nodes having an issue.

John
PBWebGuy
2010-Nov-10 17:58 UTC
[Puppet Users] Re: Could not retrieve catalog from remote server - random
I have continued troubleshooting this and have gone as far as:

1. Removed all code from the nodes.pp for this particular node.

2. Deleted the /var/lib/puppet directory on the node and re-signed with the PM. Note that the signing failed repeatedly as well, and the PM did not report any messages in debug mode.

The tcpdump showed traffic for every request from the node.

Regards,

John
PBWebGuy
2010-Nov-10 20:30 UTC
[Puppet Users] Re: Could not retrieve catalog from remote server - random
After 6 hours of troubleshooting, we found that there was a process (Alfresco/Tomcat) running on the node that was consuming large amounts of CPU. Running 'top' showed a sustained load of 2.x. As soon as I killed the process, Puppet started running perfectly!

This node is a VM in a VMware cloud.

Hope that this will help someone else someday...

Regards,

John
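Since the cause turned out to be load on the node itself, a quick load check on the agent before digging into the network can save time. Here is a small Python sketch of that check, not something from the thread: `is_overloaded` is a hypothetical helper, and the threshold of 1.0 runnable processes per CPU is an assumption, since the thread gives the load (2.x) but not the number of CPUs in the VM.

```python
import os


def load_per_cpu(load1, cpus):
    """1-minute load average divided by the number of CPUs (at least 1)."""
    return load1 / max(cpus, 1)


def is_overloaded(load1, cpus, threshold=1.0):
    """True when the load per CPU exceeds the (assumed) threshold."""
    return load_per_cpu(load1, cpus) > threshold


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute averages on Unix-like systems.
    load1, _load5, _load15 = os.getloadavg()
    cpus = os.cpu_count() or 1
    state = "overloaded" if is_overloaded(load1, cpus) else "ok"
    print(f"load1={load1:.2f} cpus={cpus} per-cpu={load_per_cpu(load1, cpus):.2f} -> {state}")
```

With a single virtual CPU, a load of 2.3 would be flagged; with four it would not, so the threshold is worth adjusting to the VM's size.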