Pekka Savola
2008-Jun-30 21:14 UTC
[Dovecot] Server power loss and "Dovecot is already running with PID xxx"
Hi,

I'm running Dovecot 1.0.7 (with various patches) on CentOS 5.2. The server has suffered a couple of power loss events. Dovecot is run as a standalone server.

The problem is that dovecot refuses to start up at boot because the PID file from before the power loss is left behind. The message is as follows:

$ /sbin/service dovecot start
Starting Dovecot Imap: Error: Dovecot is already running with PID 10825 (read from /var/run/dovecot/master.pid)
Fatal: Invalid configuration in /etc/dovecot.conf
[FAILED]

(Note: there is nothing wrong in the configuration file so the error message is somewhat misleading.)

I looked at the release notes of 1.0.xx releases and they didn't mention this.

Is this already a known problem?
Should the start-up logic be made more robust (e.g. check whether a process corresponding to the PID actually exists)?

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings
Timo Sirainen
2008-Jul-01 07:51 UTC
[Dovecot] Server power loss and "Dovecot is already running with PID xxx"
On Tue, 2008-07-01 at 00:14 +0300, Pekka Savola wrote:
> $ /sbin/service dovecot start
> Starting Dovecot Imap: Error: Dovecot is already running with PID 10825
> (read from /var/run/dovecot/master.pid)
> Fatal: Invalid configuration in /etc/dovecot.conf
> [FAILED]
> (Note: there is nothing wrong in the configuration file so the error
> message is somewhat misleading.)

Yes, it's a bit misleading. But I don't think I'll bother fixing it before rewriting the master/config handling for v2.0.

> Is this already a known problem?
> Should the start-up logic be made more robust (e.g. check whether a
> process corresponding to the PID actually exists)?

It already checks if the PID exists, but it doesn't check what that process is (and I don't think there is a portable way to do it anyway).

I don't think it's too much to ask to delete the master.pid if in rare situations it fails to start due to a PID conflict.
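The two checks discussed in this message (does a process with the recorded PID exist, and is that process actually Dovecot) can be sketched roughly as follows. This is not Dovecot's own code. It is a minimal illustration that assumes a POSIX system for the existence probe and a Linux-style /proc filesystem for the identity check, which is exactly the non-portable part. The function names and the expected process name "dovecot" are assumptions made for the example.

```python
import errno
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_exists(pid):
    """Return True if some process currently holds this PID (signal 0 probe)."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks the target
    except OSError as e:
        # EPERM means the process exists but belongs to another user.
        return e.errno == errno.EPERM
    return True


def process_name(pid):
    """Linux-only: basename of argv[0] from /proc/<pid>/cmdline.

    Returns None when /proc is unavailable or unreadable, and "" for
    processes with an empty command line (for example kernel threads).
    """
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as f:
            argv0 = f.read().split(b"\0")[0]
    except OSError:
        return None
    return os.path.basename(argv0.decode(errors="replace"))


def is_stale(pid_file, expected_name="dovecot"):
    """Return True if pid_file does not point at a live process of that name."""
    pid = read_pid(pid_file)
    if pid is None or not pid_exists(pid):
        return True
    name = process_name(pid)
    # If the name cannot be determined, stay conservative and assume it is live.
    return name is not None and name != expected_name
```

An init script could call something like is_stale("/var/run/dovecot/master.pid") before starting the service and remove the file only when it returns True. The name comparison is a heuristic: it misses renamed binaries and wrapper scripts.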
Sean Kamath
2008-Jul-02 04:55 UTC
[Dovecot] Server power loss and "Dovecot is already running with PID xxx"
On Jul 1, 2008, at 12:51 AM, Timo Sirainen wrote:
>> Is this already a known problem?
>> Should the start-up logic be made more robust (e.g. check whether a
>> process corresponding to the PID actually exists)?
>
> It already checks if the PID exists, but it doesn't check what that
> process is (and I don't think there is a portable way to do it
> anyway).
> I don't think it's too much to ask to delete the master.pid if in rare
> situations it fails to start due to a PID conflict.

This is a pet peeve of mine for many services started at boot time. Since the ordering of service startup is usually fairly static, a *LOT* of times process IDs are nearly identical on boot. Depending on which way they go, if they drift towards earlier, you'll have the PID in use. This drove me NUTS with Sun's LDAP server.

Many recent OSes are now using memory-based filesystems for /var/run, or otherwise clear out /var/run at boot time. But if a process stores its PID somewhere else, you're SOL (much like Sun One Directory Server does).

The problem with having to remove a master.pid file on boot is that you might have a BUNCH of clients or customers that are using your system, and you're either asleep at 3am when the server kicked over, or in another state. It's not a problem if you have staff watching machines reboot. ;-)

Sorry, had to kibitz.

Sean

PS I often times add a 'rm $PID' line in the init.d script, and let a server die because it couldn't bind to the port. That doesn't work with everything, though.
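The idea in the PS (drop the PID file and let the listening port decide whether a server is really running) could look roughly like the sketch below. It is only an illustration: the helper names are made up for this example, and there is a window between the check and the actual service start in which another process could still grab the port. Like the 'rm $PID' trick itself, it only applies to daemons that hold a well-known TCP port.

```python
import errno
import os
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can bind host:port, i.e. nobody holds it."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        s.bind((host, port))
    except OSError as e:
        if e.errno == errno.EADDRINUSE:
            return False
        raise  # e.g. EACCES for a privileged port when not running as root
    else:
        return True
    finally:
        s.close()


def remove_pid_file_if_port_free(pid_file, port, host="0.0.0.0"):
    """Delete pid_file when the service port is unbound.

    Returns True if the port was free (the file is gone afterwards),
    False if something is listening and the file was left alone.
    """
    if not port_is_free(port, host):
        return False
    try:
        os.remove(pid_file)
    except FileNotFoundError:
        pass
    return True
```

For an IMAP server this might be called with port 143 from the init script, as root, just before the normal start command.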
Pekka Savola
2008-Aug-02 17:22 UTC
[Dovecot] Server power loss and "Dovecot is already running with PID xxx"
On Tue, 1 Jul 2008, Timo Sirainen wrote:
>> $ /sbin/service dovecot start
>> Starting Dovecot Imap: Error: Dovecot is already running with PID 10825
>> (read from /var/run/dovecot/master.pid)
>> Fatal: Invalid configuration in /etc/dovecot.conf
>> [FAILED]
>> (Note: there is nothing wrong in the configuration file so the error
>> message is somewhat misleading.)
>
> Yes, it's a bit misleading. But I don't think I'll bother fixing it
> before rewriting the master/config handling for v2.0.
>
>> Is this already a known problem?
>> Should the start-up logic be made more robust (e.g. check whether a
>> process corresponding to the PID actually exists)?
>
> It already checks if the PID exists, but it doesn't check what that
> process is (and I don't think there is a portable way to do it anyway).
> I don't think it's too much to ask to delete the master.pid if in rare
> situations it fails to start due to a PID conflict.

Getting back to this after another power loss.

It doesn't seem to be that the current logic is working; there is no program with the PID that's in master.pid, and dovecot (1.0.7 + RHEL patches) refuses to start.

root: /root$ /sbin/service dovecot start
Starting Dovecot Imap: Error: Dovecot is already running with PID 2746 (read from /var/run/dovecot/master.pid)
Fatal: Invalid configuration in /etc/dovecot.conf
[FAILED]
root: /root$ more /var/run/dovecot/master.pid
2746
root: /root$ ps auxw | grep 2746
root     31714  0.0  0.1   4116   584 pts/1    R+   20:19   0:00 grep 2746

--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings