Don't know if this helps with anything, but it just hung after 2 days again ... nothing on the console ... top process running at the time shows the following ... anything there look "concerning"?

last pid: 5196; load averages: 9.25, 15.97, 10.07 up 2+07:58:36 04:02:28
1874 processes:317 running, 1537 sleeping, 20 zombie
CPU: 6.2% user, 0.0% nice, 6.7% system, 0.3% interrupt, 86.8% idle
Mem: 4552M Active, 162M Inact, 684M Wired, 46M Cache, 399M Buf, 8240K Free
Swap: 8192M Total, 1308M Used, 6884M Free, 15% Inuse, 1360K In, 63M Out

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
28752 root       5  96    0   427M   408M select 1   1:55  0.00% named
 9720 nobody    19  97    0   402M   186M RUN    1   0:00  0.69% nsd
54395 root      16  20    0  1308M   163M kserel 0   0:00  0.00% java
 8500 nobody    10 102    0   193M 86492K ucond  1   0:07  0.00% nsd
 3302 102        1  96    0   158M 66100K select 1   0:37  0.00% postgres
 7853 1304       1  96    0   154M 54408K select 1   0:39  0.00% postgres
10670 88        28  20    0   335M 42488K kserel 0   0:00  0.44% mysqld
 4976 root       5   4    0 95444K 41740K kqread 1   1:09  0.00% named
14003 www       44  96    0   443M 41632K ucond  1   0:00  0.00% java
 8528 nobody    15  96    0   188M 37904K ucond  1   0:00  0.00% nsd
 5157 88       109  96    0 97620K 33704K RUN    0   0:00  0.00% mysqld
 1759 www        1   4    0   167M 32276K select 1   0:01  0.00% httpd
99407 www        1   4    0   165M 31712K sbwait 0   0:02  0.00% httpd
 4006 www        1   4    0   124M 31424K sbwait 1   0:01  0.29% httpd
 1299 www        1   4    0   164M 31376K sbwait 1   0:02  0.00% httpd
 1758 www        1   4    0   164M 31176K sbwait 0   0:02  0.00% httpd
99402 www        1  96    0   163M 29892K CPU1   1   0:03  0.00% httpd
 4036 www        1  20    0   122M 28680K lockf  1   0:00  0.00% httpd
 1757 www        1   4    0   158M 27856K sbwait 1   0:02  0.00% httpd
 3899 www        1  96    0   160M 27688K RUN    0   0:00  0.00% httpd
 4007 www        1  20    0   125M 27588K lockf  0   0:01  2.10% httpd
 4525 www        1  96    0   158M 26624K RUN    1   0:00  0.00% httpd
 4607 www        1  96    0   158M 26096K RUN    0   0:00  0.00% httpd
13635 88        34  96    0 92340K 25604K CPU0   0   0:00  0.05% mysqld
 4024 www        1  96    0   156M 24880K RUN    1   0:00  0.10% httpd
 3585 102        1   4    0   163M 24748K sbwait 1   2:56  0.00% postgres
 3951 www        1  96    0   155M 24548K RUN    1   0:00  0.10% httpd
 4022 www        1  96    0   155M 24320K RUN    0   0:00  0.00% httpd
 3960 www        1  96    0   155M 24316K RUN    1   0:00  0.00% httpd
 3388 102        1   4    0   161M 24228K sbwait 0   1:07  0.00% postgres
 4023 www        1  96    0   155M 23988K RUN    1   0:00  0.00% httpd
99468 www        1  96    0   104M 23660K RUN    1   0:03  0.00% httpd
99423 www        1   4    0   154M 23456K sbwait 0   0:03  0.00% httpd
 3959 www        1  -4    0   103M 23144K devfs  0   0:00  0.00% httpd
 5004 www        1   4    0   154M 23032K sbwait 1   0:00  0.00% httpd
62771 www        1 -16    0   143M 22824K vnread 1   0:01  0.00% httpd
 4612 www        1  96    0   153M 21936K RUN    1   0:00  0.15% httpd
 4609 www        1  96    0   153M 21936K RUN    0   0:00  0.05% httpd
 5180 www        1  96    0   145M 21660K RUN    0   0:12  0.00% httpd
 5007 www        1   4    0   115M 21360K sbwait 0   0:00  0.29% httpd
57327 www        1  -8    0   145M 20996K biord  0   0:04  0.20% httpd
29064 www        1  -8    0   143M 20812K biord  1   0:04  0.00% httpd
99381 www        1  96    0   151M 19364K RUN    1   0:04  0.00% httpd
 4682 root       1   4    0 62388K 17828K kqread 1   0:00  0.00% perl
 9447 88         8  20    0 61388K 17508K kserel 0   0:00  0.05% mysqld
13457 bind       5  96    0 45724K 17424K RUN    0   0:14  0.00% named
87535 www        1   4    0   149M 17396K sbwait 1   0:09  0.00% httpd
 4611 www        1   4    0   146M 17008K sbwait 1   0:00  0.00% httpd
 3386 102        1  -4    0   163M 16544K semwai 0   0:51  0.00% postgres
91929 www        1   4    0   113M 16196K sbwait 0   0:04  0.00% httpd
 4757 www        1  96    0   145M 16144K RUN    0   0:00  0.00% httpd
10269 88         5  20    0 57504K 16000K kserel 0   0:00  0.00% mysqld
 3946 www        1   4    0   126M 15552K sbwait 1   0:01 15.00% httpd
 3619 www        1   4    0   113M 15172K sbwait 1   0:00  0.00% httpd
 3385 102        1  96    0   163M 14932K RUN    1   0:50  0.00% postgres
28755 102        1   4    0   159M 14760K sbwait 0  31:36  0.35% postgres

----
Marc G. Fournier    Hub.Org Networking Services (http://www.hub.org)
Email . scrappy@hub.org    MSN . scrappy@hub.org
Yahoo . yscrappy    Skype: hub.org    ICQ . 7615664
Marc G. Fournier wrote:
> Don't know if this helps with anything, but it just hung after 2 days
> again ... nothing on the console ... top process running at the time
> shows the following ... anything there look "concerning"?

Looks like a dead/livelock between devfs and ufs, but I don't have further hints about this. The last time we had this was in the 7.0 era, and upgrading to 7.1 fixed it, IIRC...

--
Xin LI <delphij@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
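One way to pull the rows relevant to that suspicion out of a saved top listing is sketched below. It groups PIDs by the STATE column for a handful of wait channels. The set `FS_WCHANS` is an assumption made for illustration (wait channels that look filesystem-related in this listing), not an official FreeBSD classification, and `group_fs_waiters` is a hypothetical helper name. The sample rows are copied from the output posted at the start of the thread.

```python
from collections import defaultdict

# A few rows copied from the top(1) output in the original message.
SAMPLE = """\
 3959 www 1 -4 0 103M 23144K devfs 0 0:00 0.00% httpd
62771 www 1 -16 0 143M 22824K vnread 1 0:01 0.00% httpd
57327 www 1 -8 0 145M 20996K biord 0 0:04 0.20% httpd
29064 www 1 -8 0 143M 20812K biord 1 0:04 0.00% httpd
 4036 www 1 20 0 122M 28680K lockf 1 0:00 0.00% httpd
 1759 www 1 4 0 167M 32276K select 1 0:01 0.00% httpd
"""

# ASSUMPTION: wait channels treated as filesystem-related for this sketch.
FS_WCHANS = {"devfs", "vnread", "biord", "lockf"}


def group_fs_waiters(text, wchans=FS_WCHANS):
    """Map each selected wait channel to the PIDs sleeping on it.

    Assumes the column order shown in the thread:
    PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    """
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split()
        # Skip header/summary lines: a process row starts with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        state = fields[7]
        if state in wchans:
            groups[state].append(int(fields[0]))
    return dict(groups)


if __name__ == "__main__":
    for state, pids in sorted(group_fs_waiters(SAMPLE).items()):
        print(state, pids)
```

A listing like this only shows where processes were sleeping at one instant; it does not by itself confirm a deadlock or livelock.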
Marc, and folks,

I have a similar "hang" problem on 6.4-STABLE and 7.2-STABLE servers, which run apache, squid, inn, named, isc-dhcpd and so on, but no DB servers. What kind of information should I check to solve this annoying problem?

I'm running munin-node on these machines, too.

Thanks.

>>>>> In <20090513040719.D17646@hub.org>
>>>>> "Marc G. Fournier" <scrappy@hub.org> wrote:

> Don't know if this helps with anything, but it just hung after 2 days
> again ... nothing on the console ... top process running at the time
> shows the following ... anything there look "concerning"?

--
NAKAJI Hiroyuki
On Wednesday 13 May 2009 3:09:33 am Marc G. Fournier wrote:
> Don't know if this helps with anything, but it just hung after 2 days again
> ... nothing on the console ... top process running at the time shows the
> following ... anything there look "concerning"?

Is this a 2-CPU system? If so, both CPUs are actually running something, so it is not a deadlock per se.

> 99402 www 1 96 0 163M 29892K CPU1 1 0:03 0.00% httpd
> 13635 88 34 96 0 92340K 25604K CPU0 0 0:00 0.05% mysqld

--
John Baldwin
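The check John describes (looking for processes whose STATE is `CPU0`, `CPU1`, and so on, meaning they were on a processor when the snapshot was taken) can be sketched as a small script over a saved listing. The helper names `parse_rows` and `on_cpu` are hypothetical, and the column layout is assumed to match the output posted in this thread.

```python
import re

# Rows copied from the top(1) output in the original message.
SAMPLE = """\
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
99402 www 1 96 0 163M 29892K CPU1 1 0:03 0.00% httpd
13635 88 34 96 0 92340K 25604K CPU0 0 0:00 0.05% mysqld
3959 www 1 -4 0 103M 23144K devfs 0 0:00 0.00% httpd
4036 www 1 20 0 122M 28680K lockf 1 0:00 0.00% httpd
3899 www 1 96 0 160M 27688K RUN 0 0:00 0.00% httpd
1759 www 1 4 0 167M 32276K select 1 0:01 0.00% httpd
"""


def parse_rows(text):
    """Yield (pid, state, command) for each process row.

    Assumes the column order shown in the thread:
    PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
    """
    for line in text.splitlines():
        fields = line.split()
        # The header row and summary lines do not start with a numeric PID.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        yield int(fields[0]), fields[7], fields[11]


def on_cpu(rows):
    """Return (state, pid, command) for rows whose state is CPU<n>, sorted by state."""
    return sorted(
        (state, pid, cmd)
        for pid, state, cmd in rows
        if re.fullmatch(r"CPU\d+", state)
    )


if __name__ == "__main__":
    for state, pid, cmd in on_cpu(parse_rows(SAMPLE)):
        print(f"{state}: pid {pid} ({cmd})")
```

If every CPU in the machine shows up in this list, something was still being scheduled at the time of the snapshot, which is the point made above; it says nothing about whether those processes were making useful progress.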