Adam McDougall
2008-Sep-02 13:48 UTC
[Dovecot] "pam_start() failed: system error" with dovecot 1.1.2, cause unknown
I would guess this is unlikely to be dovecot's fault, but I'm wondering if anyone has any ideas of what might have happened based on the evidence. My best guess is some kind of resource limit was reached but I don't see any evidence in the logs, and the condition is now gone. Suddenly this morning, one (and only one) of my dovecot servers decided to start failing all logins since 08:25:04 until we restarted dovecot, at which point they were working fine. The number of imap-login processes was under the limit, but there were some obvious PAM errors at the time. My account could still ssh to the system so I don't think it was a problem general authentication, and NIS on other systems was working fine. No one was logged into that server at the time the problems occurred, and I don't think anything happened to the actual pam libraries to make them missing since dovecot worked after a restart. I should have used other means to prevent people from using that dovecot instance rather than stopping it, and I'll do so if it happens again in hopes of further debugging. /var/log/maillog: Sep 2 08:25:01 hill dovecot: imap-login: Login: user=<userA>, method=PLAIN, rip=127.0.0.1, lip=127.0.0.1, secured Sep 2 08:25:01 hill dovecot: IMAP(userA): Disconnected: Logged out bytes=127/568 Sep 2 08:25:01 hill dovecot: imap-login: Login: user=<userA>, method=PLAIN, rip=127.0.0.1, lip=127.0.0.1, secured Sep 2 08:25:01 hill dovecot: IMAP(userA): Disconnected: Logged out bytes=282/9641 Sep 2 08:25:04 hill dovecot: imap-login: Login: user=<userA>, method=PLAIN, rip=127.0.0.1, lip=127.0.0.1, secured Sep 2 08:25:04 hill dovecot: IMAP(userA): Disconnected: Logged out bytes=46/543 ***problem started here Sep 2 08:25:04 hill dovecot: auth-worker(default): pam(userA,127.0.0.1): pam_start() failed: system error Sep 2 08:25:04 hill dovecot: auth-worker(default): pam(userB,35.9.37.164): pam_start() failed: system error Sep 2 08:25:05 hill dovecot: auth-worker(default): pam(userC,35.9.37.164): pam_start() failed: system error Sep 2 08:25:06 hill dovecot: imap-login: Aborted login (auth failed, 1 attempts): user=<userB>, method=PLAIN, rip=35.9.37.164, lip=35.9.37.190, TLS Sep 2 08:25:06 hill dovecot: imap-login: Aborted login (auth failed, 1 attempts): user=<userA>, method=PLAIN, rip=127.0.0.1, lip=127.0.0.1, secured Sep 2 08:25:07 hill dovecot: imap-login: Aborted login (auth failed, 1 attempts): user=<userC>, method=PLAIN, rip=35.9.37.164, lip=35.9.37.190, TLS ..... /var/log/messages: Sep 2 08:25:04 hill dovecot-auth: in openpam_load_module(): no pam_permit.so found Sep 2 08:25:04 hill dovecot-auth: in openpam_load_module(): no pam_login_access.so found Sep 2 08:25:05 hill dovecot-auth: in openpam_load_module(): no pam_nologin.so found Sep 2 08:25:10 hill dovecot-auth: in openpam_load_module(): no pam_unix.so found Sep 2 08:25:11 hill dovecot-auth: in openpam_load_module(): no pam_unix.so found Sep 2 08:25:20 hill dovecot-auth: in openpam_load_module(): no pam_opieaccess.so found Sep 2 08:25:20 hill dovecot-auth: in openpam_load_module(): no pam_opie.so found Sep 2 08:25:51 hill kernel: Sep 2 08:25:51 hill last message repeated 12 times Sep 2 08:27:52 hill kernel: Sep 2 08:27:52 hill last message repeated 37 times Sep 2 08:38:01 hill kernel: Sep 2 08:38:01 hill last message repeated 144 times Sep 2 08:48:06 hill kernel: Sep 2 08:48:06 hill last message repeated 129 times Sep 2 08:53:36 hill kernel: Sep 2 08:52:51 hill last message repeated 51 times
Timo Sirainen
2008-Sep-02 13:59 UTC
[Dovecot] "pam_start() failed: system error" with dovecot 1.1.2, cause unknown
On Sep 2, 2008, at 4:48 PM, Adam McDougall wrote:> I would guess this is unlikely to be dovecot's fault, but I'm > wondering if anyone has any ideas of what might have happened based > on the evidence. My best guess is some kind of resource limit was > reached but I don't see any evidence in the logs, and the condition > is now gone. > > Suddenly this morning, one (and only one) of my dovecot servers > decided to start failing all logins since 08:25:04 until we > restarted dovecot, at which point they were working fine. The > number of imap-login processes was under the limit, but there were > some obvious PAM errors at the time. My account could still ssh to > the system so I don't think it was a problem general authentication, > and NIS on other systems was working fine. No one was logged into > that server at the time the problems occurred, and I don't think > anything happened to the actual pam libraries to make them missing > since dovecot worked after a restart. I should have used other > means to prevent people from using that dovecot instance rather than > stopping it, and I'll do so if it happens again in hopes of further > debugging.Maybe your PAM plugins are leaking memory/fds. Have you set auth_worker_max_request_count to non-zero? That could help. -------------- next part -------------- A non-text attachment was scrubbed... Name: PGP.sig Type: application/pgp-signature Size: 194 bytes Desc: This is a digitally signed message part URL: <http://dovecot.org/pipermail/dovecot/attachments/20080902/281d9e80/attachment-0002.bin>