Andrew Tranquada
2010-Mar-26 14:56 UTC
[Samba] Winbind eventually locks "forever" if one of ActiveDirectory refuses all connections
I see this was filed as bug 7259, but I did not see anything on the mailing list about this problem. Does anyone else have a problem like this? Is there something in my configuration that is incorrect?

We have two domain controllers, and if we reboot either one of them, winbind hangs and we cannot look up any IDs. Since logins require group lookups, logging in as a local user hangs too, effectively locking us out of the box. If we keep trying as a local user we can eventually get in, but it is less than ideal and scares everyone when you cannot log in. Not rebooting the AD servers is not an option; we do keep our boxes patched with updates.

What appears to happen is that rebooting one of the AD servers causes winbind to get some kind of error and stop listening on /tmp/.winbind/pipe. When we run lsof on /tmp/.winbind/pipe and then strace -p on any of the winbind processes, none of them are watching (in their select) the file descriptor(s) listed by lsof. So it seems that when one AD server is restarted, winbind errors out and stops listening on that pipe, and when any communication happens (SID-to-UID lookups), it hangs because no one is responding on that pipe/socket.

This is with samba 3.4.5.

Our samba config:

netbios name = nimdev-afs1
workgroup = <redacted>
security = ads
realm = <redacted>
kerberos method = system keytab
idmap backend = hash
idmap uid = 4000-100000000
idmap gid = 4000-100000000
winbind enum users = yes
winbind enum groups = yes
auth methods = winbind
template shell = /bin/bash
template homedir = /home/%U
winbind normalize names = yes
winbind use default domain = yes
allow trusted domains = no
winbind cache time = 3600

What more information can I provide that would be helpful?

Thank you

--
Andrew Tranquada
Andrew Tranquada
2010-Mar-26 14:58 UTC
[Samba] Winbind eventually locks "forever" if one of ActiveDirectory refuses all connections
I do have winbind running at debug level 10, and I currently have one of the servers in this state, so if someone lets me know what would help, I can get it to them.

--
Andrew Tranquada
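The hang described in this thread (winbind no longer answering on its pipe, so every lookup blocks) could be detected from a script before it locks people out. The following is only a minimal sketch, not something from the original report: it runs a command with a timeout and classifies the outcome. The helper name `probe` and the 5-second default are illustrative assumptions. `wbinfo -p` is the standard winbindd ping and would be the natural command to pass in, on the assumption that it blocks in the same way as the lookups do when winbind stops servicing the pipe.

```python
import subprocess


def probe(cmd, timeout_s=5.0):
    """Run cmd and classify the result.

    Returns one of:
      'ok'      - command exited with status 0
      'failed'  - command exited with a non-zero status
      'hung'    - command did not finish within timeout_s seconds
      'missing' - the executable could not be found

    'probe' is a hypothetical helper name, not part of Samba.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child is killed if this expires
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "failed"


# Example use on a domain member (assumes wbinfo is on PATH):
#   status = probe(["wbinfo", "-p"])
#   'hung' would correspond to the locked-up state described above.
```

Run periodically (for example from cron), a 'hung' result could trigger an alert or a winbind restart. Whether a restart is an acceptable workaround is a local policy decision and does not fix the underlying bug.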