Alain-Pierre Perrin
2017-Jan-25 08:45 UTC
[Samba] Winbind (Samba 4.2.*, 4.5.2) recurring resolving failure for some specific users
Hello.
I'm facing a seemingly unsolvable problem on the Samba servers I
administer (on Debian stable). Those servers are joined to an
AD domain. They only serve files and are not domain
controllers. For some identified users (always the same ones), Winbind
periodically (but unpredictably) becomes unable to resolve their names,
making their shares unavailable. A "net cache flush" temporarily
solves the problem. Purging all caches doesn't help. Removing the
servers from the domain and joining them again doesn't help either. The
problem appeared on Samba 4.2.10 (on Debian) and persisted on 4.2.14
and 4.5.2 (testing).
The only solution, for now, is more of a "patch": running
a "net cache flush" every 10 minutes. It helps, even if it is not
perfect, but it doesn't explain why those identified users suffer from
this weird Samba behavior.
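A minimal sketch of a gentler variant of this workaround: instead of flushing blindly every 10 minutes, probe the affected names and flush only when a lookup fails, printing a timestamp that can later be matched against the logs. It assumes the names are resolved through NSS (libnss-winbind is installed on these servers), so Python's `pwd.getpwnam` exercises the same path. The function names and the `lookup`/`flush` parameters are hypothetical and exist only to keep the sketch testable.

```python
import pwd
import subprocess
from datetime import datetime, timezone


def resolves(username, lookup=pwd.getpwnam):
    """Return True if the name resolves through NSS, False otherwise."""
    try:
        lookup(username)
        return True
    except KeyError:
        # pwd.getpwnam raises KeyError when no entry is found.
        return False


def check_and_flush(usernames, lookup=pwd.getpwnam, flush=None):
    """Probe each user; flush the Samba cache once if any lookup fails.

    Returns the list of names that failed to resolve.
    """
    failed = [u for u in usernames if not resolves(u, lookup)]
    if failed:
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        print(f"{stamp} unresolved: {', '.join(failed)}; flushing cache")
        if flush is None:
            # Same command as the manual workaround described above.
            subprocess.run(["net", "cache", "flush"], check=False)
        else:
            flush()
    return failed
```

Called from cron or a small loop with the list of affected accounts, this would leave a record of exactly when resolution broke, which is the moment the debug logs need to cover.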
Is it an IDMAP RID bug? Do the affected users share some common
AD/LDAP attributes that make winbind choke? What kind of log would be
the most enlightening to study this hard-to-reproduce bug?
Thanks in advance for your collective help / wisdom.
Alain-Pierre Perrin
PS: Some configuration details:
# Samba config, through testparm and anonymized
# cat /etc/samba/smb.conf
[global]
bind interfaces only = Yes
dos charset = 850
interfaces = 127.0.0.1 10.100.0.1
realm = OURDOMAIN.PARENTDOMAIN
server string = ""
workgroup = OURDOMAIN
domain master = No
local master = No
preferred master = No
machine password timeout = 0
debug prefix timestamp = Yes
log file = /var/log/samba/log.%m
max log size = 100
disable spoolss = Yes
load printers = No
printcap name = /dev/null
name resolve order = host bcast
map untrusted to domain = Yes
ntlm auth = Yes
security = ADS
socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
winbind refresh tickets = Yes
winbind use default domain = Yes
dns proxy = No
idmap config otherdomain:range = 480000-509999
idmap config otherdomain:backend = rid
idmap config ourdomain:range = 30000-59999
idmap config ourdomain:backend = rid
idmap config *:range = 2000-29999
full_audit:priority = NOTICE
full_audit:facility = local6
full_audit:failure = none
full_audit:success = mkdir rename unlink rmdir pwrite write
full_audit:prefix = Audit - USER=%u | IP=%I | MACHINE=%m | VOLUME=%S
idmap config * : backend = tdb
map archive = No
map readonly = permissions
printing = bsd
create mask = 0660
directory mask = 0770
force create mode = 0660
force directory mode = 0770
inherit acls = Yes
read only = No
vfs objects = full_audit
[share1]
path = /home/share1
hosts allow = 127. 10.
# cat /etc/krb5.conf :
[libdefaults]
default_realm = OURDOMAIN.PARENTDOMAIN
dns_lookup_realm = false
dns_lookup_kdc = true
[realms]
OURDOMAIN.PARENTDOMAIN = {
kdc = dc01.ourdomain.parentdomain:88
kdc = dc02.ourdomain.parentdomain:88
kdc = dc03.ourdomain.parentdomain:88
kdc = dc04.ourdomain.parentdomain:88
default_domain = ourdomain.parentdomain
}
OTHERDOMAIN.PARENTDOMAIN = {
kdc = dc01.otherdomain.parentdomain:88
kdc = dc02.otherdomain.parentdomain:88
default_domain = otherdomain.parentdomain
}
[domain_realm]
.ourdomain.parentdomain = OURDOMAIN.PARENTDOMAIN
ourdomain.parentdomain = OURDOMAIN.PARENTDOMAIN
.otherdomain.parentdomain = OTHERDOMAIN.PARENTDOMAIN
otherdomain.parentdomain = OTHERDOMAIN.PARENTDOMAIN
# cat /proc/version
Linux version 4.8.0-0.bpo.2-amd64 (debian-kernel at lists.debian.org) (gcc
version 4.9.2 (Debian 4.9.2-10) ) #1 SMP Debian 4.8.11-1~bpo8+1 (2016-12-14)
# dpkg -l | grep -i samba
ii  libnss-winbind:amd64   2:4.5.2+dfsg-2  amd64  Samba nameservice integration plugins
ii  libwbclient0:amd64     2:4.5.2+dfsg-2  amd64  Samba winbind client library
ii  python-samba           2:4.5.2+dfsg-2  amd64  Python bindings for Samba
ii  samba                  2:4.5.2+dfsg-2  amd64  SMB/CIFS file, print, and login server for Unix
ii  samba-common           2:4.5.2+dfsg-2  all    common files used by both the Samba server and client
ii  samba-common-bin       2:4.5.2+dfsg-2  amd64  Samba common files used by both the server and the client
ii  samba-dsdb-modules     2:4.5.2+dfsg-2  amd64  Samba Directory Services Database
ii  samba-libs:amd64       2:4.5.2+dfsg-2  amd64  Samba core libraries
ii  samba-vfs-modules      2:4.5.2+dfsg-2  amd64  Samba Virtual FileSystem plugins
# cat /etc/debian_version
8.6
Volker Lendecke
2017-Jan-25 09:28 UTC
[Samba] Winbind (Samba 4.2.*, 4.5.2) recurring resolving failure for some specific users
On Wed, Jan 25, 2017 at 09:45:25AM +0100, Alain-Pierre Perrin via samba wrote:
> Hello.
>
> I'm facing a seemingly unsolvable problem on the Samba servers I
> administer (on Debian stable). Those servers are joined to an
> AD domain. They only serve files and are not domain
> controllers. For some identified users (always the same ones), Winbind
> periodically (but unpredictably) becomes unable to resolve their names,
> making their shares unavailable. A "net cache flush" temporarily
> solves the problem. Purging all caches doesn't help. Removing the
> servers from the domain and joining them again doesn't help either. The
> problem appeared on Samba 4.2.10 (on Debian) and persisted on 4.2.14
> and 4.5.2 (testing).
>
> The only solution, for now, is more of a "patch": running
> a "net cache flush" every 10 minutes. It helps, even if it is not
> perfect, but it doesn't explain why those identified users suffer from
> this weird Samba behavior.
>
> Is it an IDMAP RID bug? Do the affected users share some common
> AD/LDAP attributes that make winbind choke? What kind of log would be
> the most enlightening to study this hard-to-reproduce bug?

winbind debug level 10 logs at the time of failure help. All Samba log
files starting with "log.w*" are needed.

Volker
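As a rough illustration of gathering the files requested here, the sketch below packs everything matching `log.w*` from a log directory into one tarball. The directory would presumably be `/var/log/samba`, going by the `log file` setting in the posted smb.conf. The function name and archive path are hypothetical. It also assumes the debug level has already been raised, for instance with `smbcontrol winbindd debug 10` on the running daemon or `log level = 10` in smb.conf.

```python
import glob
import os
import tarfile


def collect_winbind_logs(log_dir, archive_path):
    """Pack every regular file matching log.w* in log_dir into a .tar.gz.

    Returns the sorted list of file names that were archived.
    """
    pattern = os.path.join(log_dir, "log.w*")
    paths = sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            # Store only the base name so the archive has no absolute paths.
            tar.add(path, arcname=os.path.basename(path))
    return [os.path.basename(p) for p in paths]
```

A call such as `collect_winbind_logs("/var/log/samba", "/tmp/winbind-logs.tar.gz")`, run right after a failure, would capture the state of those logs at the relevant moment. The archive should be written outside the log directory so it does not match the pattern itself.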
Alain-Pierre Perrin
2017-Feb-21 19:09 UTC
[Samba] Winbind (Samba 4.2.*, 4.5.2, 4.5.4) recurring resolving failure for some specific users
Hello Volker.

I'm coming back to this after a long time. Today I captured two Samba traces (log level globally cranked up to 10) for a problematic user, comparing a successful session with a later, failed one. After a "net cache flush", the session succeeds again and then, later (it can be 2 minutes, it can be 40), fails again.

This elusive bug is killing me. For some really annoyed users I recreated their AD account from scratch, and the new user, with the same group memberships, works perfectly... but I'd really like to understand which mysterious AD property makes Samba stutter.

Here is the pivotal point in the traces:

1: Successful session

[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Create local NT token for problemuser
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Parsing value for key [IDMAP/SID2XID/S-1-5-21-4107461286-2373360197-1842339316-6863]: value=[36863:B]
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Parsing value for key [IDMAP/SID2XID/S-1-5-21-4107461286-2373360197-1842339316-6863]: id=[36863], endptr=[:B]
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] sid S-1-5-21-4107461286-2373360197-1842339316-6863 -> uid 36863
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] sys_getgrouplist: user [problemuser]
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] gid 30513 -> sid S-1-5-21-4107461286-2373360197-1842339316-513
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] push_sec_ctx(0, 0) : sec_ctx_stack_ndx = 1
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] push_conn_ctx(0) : conn_ctx_stack_ndx = 0
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] setting sec ctx (0, 0) - sec_ctx_stack_ndx = 1
[xxxx-xx-xx xx:xx:xx, 5, pid=xxx, effective(0, 0), real(0, 0)] Security token: (NULL)
[xxxx-xx-xx xx:xx:xx, 5, pid=xxx, effective(0, 0), real(0, 0)] UNIX token of user 0 Primary group is 0 and contains 0 supplementary groups
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] pop_sec_ctx (0, 0) - sec_ctx_stack_ndx = 0
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] get_privileges: No privileges assigned to SID [S-1-5-21-4107461286-2373360197-1842339316-6863]
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] get_privileges: No privileges assigned to SID [S-1-5-21-4107461286-2373360197-1842339316-513]
[xxxx-xx-xx xx:xx:xx, 4, pid=xxx, effective(0, 0), real(0, 0)] get_privileges: No privileges assigned to SID [S-1-5-21-4107461286-2373360197-1842339316-8531]
(...long enumeration of groups...)

2: Failed session

[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Create local NT token for problemuser
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Parsing value for key [IDMAP/SID2XID/S-1-5-21-4107461286-2373360197-1842339316-6863]: value=[36863:B]
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] Parsing value for key [IDMAP/SID2XID/S-1-5-21-4107461286-2373360197-1842339316-6863]: id=[36863], endptr=[:B]
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] sid S-1-5-21-4107461286-2373360197-1842339316-6863 -> uid 36863
[xxxx-xx-xx xx:xx:xx, 1, pid=xxx, effective(0, 0), real(0, 0)] SID S-1-5-21-4107461286-2373360197-1842339316-6863 -> getpwuid(36863) failed
[xxxx-xx-xx xx:xx:xx, 3, pid=xxx, effective(0, 0), real(0, 0)] Failed to finalize nt token
[xxxx-xx-xx xx:xx:xx, 10, pid=xxx, effective(0, 0), real(0, 0)] create_local_token failed: NT_STATUS_UNSUCCESSFUL
[xxxx-xx-xx xx:xx:xx, 1, pid=xxx, effective(0, 0), real(0, 0)] Failed to generate session_info (user and group token) for session setup: NT_STATUS_UNSUCCESSFUL

On a hunch, I tried stripping the account of its SID history, inherited from a recent AD-to-AD migration, but the result remained the same. I also migrated from Samba 4.5.2 to 4.5.4 (Debian testing).

Thanks in advance. I realize how hard it can be to harden Samba, because it's impossible to test it against all the Active Directory installations running in the world.

Alain-Pierre Perrin
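One small check that can be made from the traces alone is whether the SID-to-ID mapping itself is what the rid backend should produce. The sketch below applies the formula given in the idmap_rid documentation, ID = RID - BASE_RID + LOW_RANGE_ID, to the two SIDs in the excerpts. It assumes the default `base_rid` of 0 and uses the `idmap config ourdomain:range = 30000-59999` setting from the first message. The helper names are hypothetical.

```python
def sid_rid(sid):
    """Return the RID, i.e. the last sub-authority of a SID string."""
    return int(sid.rsplit("-", 1)[1])


def rid_to_unix_id(rid, range_low, range_high, base_rid=0):
    """Map a RID to a Unix ID as documented for the rid idmap backend.

    ID = RID - BASE_RID + LOW_RANGE_ID; returns None when the result
    falls outside the configured range.
    """
    unix_id = rid - base_rid + range_low
    if range_low <= unix_id <= range_high:
        return unix_id
    return None


OURDOMAIN_RANGE = (30000, 59999)  # from "idmap config ourdomain:range"
user_sid = "S-1-5-21-4107461286-2373360197-1842339316-6863"
group_sid = "S-1-5-21-4107461286-2373360197-1842339316-513"

print(rid_to_unix_id(sid_rid(user_sid), *OURDOMAIN_RANGE))   # 36863
print(rid_to_unix_id(sid_rid(group_sid), *OURDOMAIN_RANGE))  # 30513
```

Both values agree with the traces (uid 36863 and gid 30513), and the SID-to-uid line is identical in the successful and the failed session. This suggests, without proving it, that the mapping step is consistent and that the failing part is the later `getpwuid(36863)` lookup.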