Roland RoLaNd
2011-Mar-07 07:31 UTC
[CentOS] BUG: soft lockup CPU stuck for 10seconds (Server went down)
Hello,

Today my server stopped responding. I went to the console, and the screen was showing a continuous loop of the following message:

BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]

along with a lot of other information. I took a screenshot of it, which you can find at the following URL: http://img585.imageshack.us/i/img00012201103070833.jpg/

I had to hard reset the server to get it back up and running.

I tried googling but had no luck finding directly relevant information, so I am hoping you can help out.

Thanks,
--Roland
Alexander Dalloz
2011-Mar-07 07:42 UTC
[CentOS] BUG: soft lockup CPU stuck for 10seconds (Server went down)
On 07.03.2011 08:31, Roland RoLaNd wrote:
> BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]

Is there a good reason to be running CentOS 5.4 (or at least its kernel)? Are you running a remotely vulnerable Oracle Java that could be hit by a DoS through a web application?

Alexander
Frank Cox
2011-Mar-07 07:46 UTC
[CentOS] BUG: soft lockup CPU stuck for 10seconds (Server went down)
On Mon, 07 Mar 2011 09:31:42 +0200 Roland RoLaNd wrote:
> BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]
> i tried googling with no luck for direct relevant info.

The first Google result for the above string takes me here: http://bugs.centos.org/view.php?id=3582

That in turn contains a reference that takes me here: https://bugzilla.redhat.com/show_bug.cgi?id=484590

It appears that this issue was fixed in kernel version 2.6.18-164.
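As a rough way to check a box against the release mentioned above, the sketch below compares a kernel release string with 2.6.18-164. It assumes release strings of the form "2.6.18-164.el5" and looks only at the leading numeric fields; the helper names (release_key, has_fix) are made up for illustration, and it says nothing about whether a given build really carries the patch.

```python
import platform
import re

# Release named in this thread as containing the fix.
FIXED_RELEASE = "2.6.18-164"


def release_key(release):
    """Turn a release string such as '2.6.18-164.el5' into a tuple of ints.

    Only the three version fields and the first build number are used;
    anything after that (for example '.el5' or '.11.1.el5') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def has_fix(release, fixed=FIXED_RELEASE):
    """Return True if 'release' is at or above the 'fixed' release."""
    return release_key(release) >= release_key(fixed)


if __name__ == "__main__":
    running = platform.release()  # same string that 'uname -r' reports
    print(running, "at or above", FIXED_RELEASE, ":", has_fix(running))
```

Run on the affected server, this prints the running release and whether it is at or above the release cited in the bug reports.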
James A. Peltier
2011-Mar-07 17:24 UTC
[CentOS] BUG: soft lockup CPU stuck for 10seconds (Server went down)
----- Original Message -----
| BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]

This is likely due to thread deadlock. What is the load on the machine at the time that this error occurs? I've seen this very error when running The Mathworks Distributed Computing Toolbox server. The machine would become very "unsettled" when it tried to run on more than four 2 CPUs. How many CPUs are in this system?

James A. Peltier
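The two questions above (load and CPU count) can be answered with a short script such as the following sketch. It assumes Python 3 on a Unix-like system, and it only reports the load at the moment it runs, not the load at the time of the lockup; the function name load_per_cpu is made up for illustration.

```python
import os


def load_per_cpu():
    """Return (cpu_count, (load1, load5, load15), load1 / cpu_count)."""
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    loads = os.getloadavg()     # 1, 5 and 15 minute load averages
    return cpus, loads, loads[0] / cpus


if __name__ == "__main__":
    cpus, loads, ratio = load_per_cpu()
    print("CPUs: %d" % cpus)
    print("load average (1/5/15 min): %.2f %.2f %.2f" % loads)
    print("1-minute load per CPU: %.2f" % ratio)
```

A 1-minute load per CPU that stays well above 1 suggests more runnable tasks than the machine has CPUs to run them on.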
David Sommerseth
2011-Mar-08 10:06 UTC
[CentOS] BUG: soft lockup CPU stuck for 10seconds (Server went down)
On 07/03/11 08:31, Roland RoLaNd wrote:
> BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]

Some real kernel developers might have better insight into why this happens, but this hits the APIC timers during a syscall. I would probably try booting the box with 'noapic' on the kernel command line to see whether that improves things.

Do you always see "soft lockup - CPU#0", or does it happen on other CPUs as well? And if it does, is the java process running on more than one CPU?

kind regards,

David Sommerseth
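To see whether the lockups are confined to CPU#0, the messages can be counted per CPU and per task. The sketch below assumes the lines have the same shape as the one quoted in this thread and that they were captured in a log file; the path /var/log/messages is an assumption, and the second sample line is invented for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the message quoted in this thread:
#   BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def summarise_lockups(lines):
    """Count soft-lockup messages per CPU number and per task name."""
    per_cpu = Counter()
    per_task = Counter()
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            per_cpu[int(match.group("cpu"))] += 1
            per_task[match.group("comm")] += 1
    return per_cpu, per_task


if __name__ == "__main__":
    sample = [
        "BUG: soft lockup - CPU#0 stuck for 10s! [java:13959]",
        # Hypothetical second line, not taken from the original report.
        "kernel: BUG: soft lockup - CPU#1 stuck for 10s! [java:13960]",
    ]
    per_cpu, per_task = summarise_lockups(sample)
    print("per CPU:", dict(per_cpu))
    print("per task:", dict(per_task))
```

For a real log, pass an open file instead of the sample list, for example summarise_lockups(open("/var/log/messages")), assuming the kernel messages made it to disk before the hard reset.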