I have a client running a CentOS 6.0 machine with cPanel. The machine is fully updated with both cPanel (RELEASE) and the OS.

At first, I noticed that after cPanel's dcpumon ran (even once), applications that depend on ps lock up and iowait jumps to around 50%. Load averages start out around 20 when this happens and slowly crawl up into the hundreds. Aside from not being able to run commands like ps and some nrpe scripts, everything still seems to respond fine even with the insanely high load. We've had it online with customers hitting it at a load of 400, without complaints, while waiting for a convenient time to reboot.

Clarification: If you run ps, it kills your terminal session. dcpumon, ps, etc. will hang around and you can see them under top (top doesn't seem to be affected). If you try to kill any of these (-9, anything), they do not respond. They're indefinitely blocked. They begin producing "processes being blocked for more than 120 seconds" errors in the logs. The server runs for days without issues between occurrences, and it always seems to be related to dcpumon.

I wrote a script that checks whether dcpumon exists in crontab and removes it. The script runs every minute. cPanel's automatic updates tend to put it back every once in a while, and it's possible that updates ran and that dcpumon ran before my script could remove it. I see that the script removed it last night (it logs removals) but don't know for sure whether dcpumon ran. It probably did.

It's currently running 2.6.32-71.29.1.el6.x86_64 and I am considering trying a vanilla kernel build to see if that corrects the issues. The hardware is HP DL145G3, and we have several other (non-cPanel) servers that are identical, running CentOS 6.0 without issue.

Any ideas?

Thank you,
--
James Shupe, OSRE developer/engineer
BSD/Linux support & hosting
jshupe at osre.org | www.osre.org
O 9032530140 | F 9032530150 | M 9035223425
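The removal script itself was not posted. A minimal sketch of the filtering step it describes might look like the following; the function name `strip_dcpumon` and the simple "any line mentioning dcpumon" match are assumptions for illustration, not the poster's actual code.

```python
import re

# Assumption: any crontab line that mentions "dcpumon" should be dropped.
DCPUMON = re.compile(r"dcpumon")


def strip_dcpumon(crontab_text):
    """Return (cleaned_text, removed_lines) for the given crontab contents."""
    kept, removed = [], []
    for line in crontab_text.splitlines():
        if DCPUMON.search(line):
            removed.append(line)  # caller can log these, as the post describes
        else:
            kept.append(line)
    cleaned = "\n".join(kept)
    if kept:
        cleaned += "\n"  # crontab files are expected to end with a newline
    return cleaned, removed
```

In practice the input would come from `crontab -l` and the cleaned text would be written back with `crontab -`, and only when `removed` is non-empty. Running this once a minute still leaves a window in which a freshly restored entry can fire, which matches the uncertainty described above.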
On 10/27/2011 02:58 PM, James Shupe wrote:
> It's currently running 2.6.32-71.29.1.el6.x86_64 and I am considering
> trying vanilla kernel build to see if that corrects the issues. The
> hardware is HP DL145G3, and we have several other (non-cPanel) servers
> that are identical running CentOS 6.0 without issue.
>
> Any ideas?
>

You should first try the kernel from the CR repository, 2.6.32-131.12.1.el6.x86_64. Just run "yum install centos-release-cr -y" and update the kernel (and other packages if you are up to it). You can always use the "yum history undo XXX" command to revert to the packages you had before the CR repo was enabled.

ELRepo (www.elrepo.org) is close to having a kernel-ml package for 6.x, but they are not there yet. Two weeks ago Alan Bartlett posted that he is close to creating source and binary rpms. You can ask on the ELRepo mailing list.

--
Ljubomir Ljubojevic
(Love is in the Air)
PL Computers
Serbia, Europe

Google is the Mother, Google is the Father, and traceroute is your trusty Spiderman...
StarOS, Mikrotik and CentOS/RHEL/Linux consultant
On 10/27/2011 07:58 AM, James Shupe wrote:
> I have a client running a CentOS 6.0 machine with cPanel. The machine is
> fully updated with both cPanel (RELEASE) and the OS.
>
> At first, I noticed that after cPanel's dcpumon ran (even once),
> applications that depend on ps lock up and iowait jumps to around 50%.
> Load averages start out around 20 when this happens and slowly crawl up
> into the hundreds. Aside from not being able to run commands like ps and
> some nrpe scripts, everything still seems to respond fine even with the
> insanely high load. We've had it online with customers hitting it with a
> load of 400 waiting for a convenient time to reboot, without complaints.
>
> Clarification: If you run ps, it kills your terminal session. dcpumon,
> ps, etc, will hang around and you can see them under top (top doesn't
> seem to be affected.) If you try to kill any of these, (-9, anything)
> they do not respond. They're indefinitely blocked. They begin producing
> "processes being blocked for more than 120 seconds errors" in the logs.
> The server runs for days between this happening without issues and it
> always seems to be related to dcpumon.
>
> I wrote a script that checks to see if dcpumon exists in crontab and
> remove it. The script runs every minute. cPanel's automatic updates tend
> to put it back every once in a while, and it's possible that updates ran
> and that dcpumon ran before my script could remove it. I see that it
> removed it last night (it logs removals) but don't know for sure if it
> ran. It probably did.
>
> It's currently running 2.6.32-71.29.1.el6.x86_64 and I am considering
> trying vanilla kernel build to see if that corrects the issues. The
> hardware is HP DL145G3, and we have several other (non-cPanel) servers
> that are identical running CentOS 6.0 without issue.
>
> Any ideas?

Since you are using cPanel, open up a trouble ticket with them and have them take a look at it. They are usually very responsive to problems like this and may have seen this before.
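When filing such a ticket, it may help to list which processes are stuck in uninterruptible sleep (state `D`), which is the state behind the "blocked for more than 120 seconds" messages. Since ps itself hangs on this machine, the sketch below reads `/proc/<pid>/stat` directly instead. The helper names are hypothetical, and this has not been tested on the affected system, so it may or may not avoid the hang.

```python
import os


def parse_stat(stat_line):
    """Parse the text of /proc/<pid>/stat into (pid, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def blocked_processes(proc_root="/proc"):
    """Return sorted [(pid, comm)] for processes whose state is 'D'."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the entry is unreadable
        if state == "D":
            blocked.append((pid, comm))
    return sorted(blocked)
```

Calling `blocked_processes()` and printing the result gives a short list of PIDs and command names that can be pasted into the ticket alongside the kernel log messages.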