Götz Reinicke - IT Koordinator
2009-Oct-27 21:35 UTC
[CentOS] Debugging system load - How to start?
Hi,

we run an "old" mail server system which was set up a couple of years ago. The system does "everything" we need(ed). Over the last few days I noticed an abnormal increase of the system load, up to 10, and lots of users told me that their mail client connections (sending and receiving) are dropped from time to time.

I was planning to replace the server or distribute the services in the near future anyway, but I'm interested in what causes the load now or where the system "hangs" :-)

There is a lot of work to do and the setup is not very well designed, but at the time I started the mail system at our place, I had only this one server and a lot of user requests ...

The system: Intel Pentium D 3.20GHz, 8 GB RAM, 3ware 4*320GB SATA II RAID level 5, Gigabit LAN. Red Hat EL 5.4 (still 32 bit).

About 700 users, 1 GB mailbox quota (mbox), webmail system Horde, an average of 3,200 messages per day over the last 12 months.

The services and setup: Dovecot imap(s) & pop3(s), MailScanner, SpamAssassin, MySQL, BIND, httpd, sendmail.

Because there are so many config parameters I'll summarise this a little bit. Mails are checked against two blacklists (heise.de and Spamhaus) by sendmail; MailScanner and SpamAssassin use mostly the default settings. Virus scanning is done by Avira AV. Logging for MailWatch to MySQL is activated.

What tools or logfiles may give me a clue which settings should be checked or changed?

Thanks and best regards,
Götz

--
Götz Reinicke
IT-Koordinator
Tel. +49 7141 969 420
Fax +49 7141 969 55 420
E-Mail goetz.reinicke at filmakademie.de
Filmakademie Baden-Württemberg GmbH
Akademiehof 10
71638 Ludwigsburg
www.filmakademie.de
Registered at Amtsgericht Stuttgart, HRB 205016
Chair of the Supervisory Board: Prof. Dr. Claudia Hübner, State Councillor for Demographic Change and Senior Citizens in the State Ministry
Managing Director: Prof. Thomas Schadt
> I was planning to replace the server or distribute the services
> in the near future anyway, but I'm interested in what causes the load
> now or where the system "hangs" :-)

Replacing the server still would not get to the bottom of the issue, which of course would be nice to do.

You should be running some kind of performance metrics gathering. I would recommend "Munin" for long-term monitoring. The only real issue with it is that it has a 5-minute window hard-coded, so you cannot catch anything finer-grained than that. But it sounds like yours is much more than a short, spurious issue.

If you want finer granularity, use sadc and kSar. The 2 in the commands below sets a 2-second collection interval, which is pretty tight.

  /usr/lib/sa/sadc -d -I -F 2 /var/log/foo/bar

or on a 64-bit box

  /usr/lib64/sa/sadc -d -I -F 2 /var/log/foo/bar

It will log system stats at 2-second intervals until you Control-C it. Then you can view the file in kSar:

1. Go to "File", then "New Window".
2. From the new window, go to "Data", then "Local Command".
3. Enter the command

  sar -A -f /var/log/foo/bar

That's it! Now you will have some pretty graphs to look at!

--
"Don't eat anything you've ever seen advertised on TV"
- Michael Pollan, author of "In Defense of Food"
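
The two sadc invocations above differ only in the library directory, so the choice can be scripted. This is a minimal sketch, not a tested recipe: the `sadc_path` helper is hypothetical, it assumes that only x86_64 uses /usr/lib64, and /var/log/foo/bar is the placeholder path from the post. The sadc flags are copied from the post and may differ on other sysstat versions, so check the sadc man page on your system. The script only prints the commands so they can be reviewed before running.

```shell
#!/bin/sh
# Hypothetical helper: choose the sadc location by machine architecture.
# Assumption: x86_64 installs under /usr/lib64, everything else under /usr/lib.
sadc_path() {
    # $1: architecture string; defaults to the output of `uname -m`
    case "${1:-$(uname -m)}" in
        x86_64) echo /usr/lib64/sa/sadc ;;
        *)      echo /usr/lib/sa/sadc ;;
    esac
}

INTERVAL=2                  # seconds between samples, as in the post
OUTFILE=/var/log/foo/bar    # placeholder path taken from the post

# Print the collection command and the command to enter in kSar,
# rather than executing them.
echo "$(sadc_path) -d -I -F $INTERVAL $OUTFILE"
echo "sar -A -f $OUTFILE"
```

On a 32-bit box such as the one described in the original post, the first printed line should match the first sadc command above.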
Hi,

On Tue, Oct 27, 2009 at 9:35 PM, Götz Reinicke - IT Koordinator <goetz.reinicke at filmakademie.de> wrote:
> we run an "old" mail server system which was set up a couple of years
> ago. The system does "everything" we need(ed). Over the last few days I
> noticed an abnormal increase of the system load, up to 10, and lots of
> users told me that their mail client connections (sending and receiving)
> are dropped from time to time.

Try running Nmon over a couple of days, collecting stats every 10-20 seconds (a bit excessive) or every minute. Then run the nmon stats through the Nmon Analyser spreadsheet. That should show the I/O, RAM and CPU usage, and the top CPU consumers, during those periods pretty nicely.

Download Nmon for RHEL (perfectly usable on CentOS) from IBM or SourceForge:
http://www.ibm.com/developerworks/wikis/display/WikiPtype/nmon
http://nmon.sourceforge.net/pmwiki.php

The analyser is here:
http://www.ibm.com/developerworks/wikis/display/WikiPtype/nmonanalyser

Here's a sample of how to run it:

  # Intensive, a sample every 10 seconds. Will use a lot of CPU
  nmon -fT -s 10 -c 8640

  # Regular, a sample every minute
  nmon -fT -s 60 -c 1440

--
Hakan (m1fcj) - http://www.hititgunesi.org
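
In the nmon commands above, -s is the interval in seconds and -c is the number of snapshots, so both examples cover 24 hours (8640 x 10 s and 1440 x 60 s). To cover a different period, the count has to be recomputed. A small sketch of that arithmetic follows; `nmon_count` is a hypothetical helper name, and the script only prints the resulting commands instead of starting nmon.

```shell
#!/bin/sh
# Hypothetical helper: number of snapshots (-c) needed to cover
# HOURS of runtime at one snapshot every INTERVAL seconds (-s).
nmon_count() {
    # $1: hours to cover, $2: interval in seconds
    echo $(( $1 * 3600 / $2 ))
}

# The two 24-hour runs from the post
echo "nmon -fT -s 10 -c $(nmon_count 24 10)"
echo "nmon -fT -s 60 -c $(nmon_count 24 60)"

# A 48-hour run at one sample per minute
echo "nmon -fT -s 60 -c $(nmon_count 48 60)"
```

The first two printed lines reproduce the commands from the post; the third shows how the count scales for a two-day capture.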