Chuck Mattern
2007-Feb-18 13:52 UTC
[CentOS] CentOS 4.4-IBM Netvista Performace Problems, help needed.
I've got an odd situation that I need some advice on. I have two computers that I am planning to use as a cluster. I initially started with some leftover Compaq Presarios with 667 MHz CPUs. I loaded CentOS 4.3 and later updated to 4.4. Things ran normally, albeit slowly. I then had an opportunity to upgrade to a pair of IBM NetVistas with 2.26 GHz CPUs. I did this by transferring the 160 GB Western Digital IDE disks and the NICs; I did not reinstall the OS, just migrated the disks. Since then they have had the following symptoms:

- Systems frequently boot faster than the disks can spin up and have to be soft-rebooted to recognize and boot from the disks.

- Systems bog down critically after approximately 24 hours, losing system time at an increasing rate. For instance, a loop that runs date and hwclock and then sleeps for 10 minutes shows the two in sync for the first few hours; then the system time begins to fall behind at an increasing rate, and after 24 hours it essentially stops elapsing. It almost feels like the box has trouble processing interrupts. Once it gets to this state performance becomes very sluggish: top takes up to 90 seconds to display its first screen and does not update on its own, only when Enter is pressed. At times top shows 0's across the utilization line for everything, including idle.

I have gone as far as booting one of the boxes into single-user mode and running the date/hwclock loop, and even in that state the system bogs down and gradually stops elapsing time after 18-24 hours. Even shutting down is affected: a reboot takes well over an hour.

I had a copy of Ubuntu on my desk, booted into that distro from CD, and it passes the date/hwclock test (it actually lost 2 seconds over a 24-hour period, but I can live with that via ntp). I'm downloading a copy of the CentOS 4.4 live CD and will try the test with that as well, but at this point does anyone see how this could be something other than a disk incompatibility with the newer systems? Should I try reinstalling? If it is the disks, any thoughts on something I could try to avoid buying new disks? I have tried setting the BIOS to both the high-performance and legacy disk modes (not entirely sure what's behind that IBMism).

Regards,
Chuck
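For anyone wanting to repeat the date/hwclock comparison described above, here is a minimal sketch in Python rather than the original shell loop, which was not posted. The function names and the sysfs path are assumptions for illustration: the path exists on reasonably recent Linux kernels but may well be absent on a 2.6.9 kernel, in which case a different hardware-clock reader would have to be passed in.

```python
import time
from typing import Callable, Iterator, Tuple


def read_rtc_epoch(path: str = "/sys/class/rtc/rtc0/since_epoch") -> float:
    """Read the hardware clock as seconds since the epoch.

    The sysfs path is an assumption; it is not available on every kernel.
    """
    with open(path) as f:
        return float(f.read().strip())


def sample_drift(
    samples: int,
    interval: float,
    read_system: Callable[[], float] = time.time,
    read_hardware: Callable[[], float] = read_rtc_epoch,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Tuple[float, float, float]]:
    """Yield (system, hardware, hardware - system) once per interval.

    A constant offset between the two clocks is harmless (for example an
    RTC kept in local time); what matters is whether the offset grows.
    """
    for i in range(samples):
        if i > 0:
            sleep(interval)  # wait between samples, not before the first
        system = read_system()
        hardware = read_hardware()
        yield system, hardware, hardware - system
```

Iterating over sample_drift(144, 600) and printing each row would cover roughly 24 hours at 10-minute intervals. A third column that keeps growing means the system clock is falling behind the hardware clock, which is the symptom reported here.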
William L. Maltby
2007-Feb-18 16:52 UTC
[CentOS] CentOS 4.4-IBM Netvista Performace Problems, help needed.
On Sun, 2007-02-18 at 08:52 -0500, Chuck Mattern wrote:
> I've got an odd situation that I need some advise on. I have two
> computers that I am planning to use as a cluster. I initially started
> with some left over Compaq Presairos with 667MHz CPUs. I loaded CentOS
> 4.3 and later updated to 4.4. Things ran normally, albeit slowly. I
> had an opportunity to upgrade to a pair of IBM Netvistas with 2.26 GHz
> CPUs, I did this by transferring the 160GB Western Digital IDE disks and
> NICs but did not re-install the OS, just migrated the disks. Since then
> they have had the following symptoms:
>
> -Systems frequently boot faster than the disks can be spun up and have
> to be soft booted to recognize and boot from the disks.

I doubt this is related, but I had a similar situation with a couple of brand-new disks that I installed. Thought I would mention it, JIC.

Us older folks used to store unused jumpers on pins on the HDs. A ground-to-ground connection never did any harm. I used this same scheme on the new disks. The bootable master had no problems. The secondary on IDE-2 did exactly what you described. One day I got sick of it, popped it out, noted that the pins were "undocumented" and removed the jumper. Problem solved.

*sigh* Back to Scotch-taping spare caps to the HD case and replacing the deteriorated tape every once in a while.

Anyway, could you have a similar situation that is causing some long-term effect that I did not see (either it wasn't there, or my load didn't cause it to become noticeable, or I am unobservant)?

<snip>

HTH
--
Bill
Zoran Milojevic
2007-Feb-19 00:28 UTC
[CentOS] CentOS 4.4-IBM Netvista Performace Problems, help needed.
On 2/18/07, Chuck Mattern <camattern at acm.org> wrote:
> -Systems will bog down critically after approximately 24 hours loosing
> system time at an increasing rate,

We have several IBM boxes (NetVista mostly) @work that would exhibit similar behavior: they run normally at first, then after a few hours the system clock practically stops. I measured 2 minutes of wall-clock time for a "sleep 1" to return, and up to 20 seconds for "usleep 1"...

Tried updating BIOS, kernel (4.3, 4.4, updates), and some combinations of boot-time parameters (as in: clock={pit|pmtmr|..}, noapic, acpi=off and the like), without improvement. We just gave up on those due to lack of time.

See also:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=203818

The solution from comment 18 may help (that comment was submitted after we'd given up, so I'm not sure it was tried).

Cheers,
Zoran

--
"The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in. We're computer professionals. We cause accidents."
--Nathaniel Borenstein, inventor of MIME
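The "sleep 1" measurement above can be turned into a single slowdown figure. The sketch below is only an illustration, with hypothetical function names: it assumes you have some reference clock that keeps ticking correctly while the system clock stalls (a stopwatch, another machine, or the hardware clock), and it leaves the choice of that clock to the caller.

```python
import time
from typing import Callable


def timed_sleep(
    requested: float,
    reference_clock: Callable[[], float],
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Return how many reference-clock seconds pass during one sleep.

    reference_clock must be independent of the system clock under test;
    otherwise a stalled system clock would hide the very delay being measured.
    """
    start = reference_clock()
    sleep(requested)
    return reference_clock() - start


def slowdown_factor(requested: float, observed: float) -> float:
    """Ratio of observed to requested duration; about 1.0 on a healthy box."""
    if requested <= 0:
        raise ValueError("requested duration must be positive")
    return observed / requested
```

With the figures quoted above, a 1-second sleep that takes 2 minutes of wall-clock time gives slowdown_factor(1, 120), that is, a factor of 120.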
Thomas Dukes
2007-Feb-19 02:01 UTC
[CentOS] CentOS 4.4-IBM Netvista Performace Problems, help needed.
Don't know if this is related, but I have a similar problem on my NetVista 2.53 GHz. However, it seems to be kernel-related: I can run 2.6.9-34.0.2 with no problems, but with any later kernel I experience problems similar to yours. Sorry I can't help; just letting you know you're not alone. I have the 'latest' BIOS installed as well.

-Eddie