Jose Santiago Oyervides
2008-Aug-01 15:47 UTC
[Samba] Samba 3.0.31 stills fails to read and write to socket.
Hi, I recently upgraded my servers from 3.0.28 to 3.0.31 trying to solve the winbind issue previously reported (Bug# 5551) but the issue is still happening in my servers. I have an ftp server (vsftpd), configured to use pam_winbind with krb5_auth and I see some random disconnects and my users cant login. My samba servers are member of a Windows 2003 domain. The relevant lines on my log.wb-OTHERDOMAIN are saying that the write to the socket failed because the connection was reset by peer, this happened also on 3.0.28, i was hoping that 3.0.31 fix this issue. Im including my configuration and my log files. This happens only when pam_winbind authenticates users of other domains, sometimes it gets fixed itself because in my krb5.conf i have configured several domain controllers for the other domains and it changes the connections to the next server, but sometimes it gets stuck with one failed server and all my users cant login for a while. Regards, Jose Santiago Oyervides. This is my setup: [global] workgroup = MYDOMAIN netbios name = MYSERVER security = ADS password server = 10.X.X.1 10.X.X.2 10.X.X.3 encrypt passwords = Yes wins server = 10.X.Y.1 10.X.Y.2 local master = no domain master = no preferred master = no log level =10 passdb:10 auth:10 winbind:10 idmap:10 smb:10 acls:10 log file = /var/log/samba/%m.log max log size = 1000 idmap uid = 10000-60000 idmap gid = 10000-60000 winbind enum users = no winbind enum groups = no winbind refresh tickets = true realm = MYDOMAIN.FORREST.COM winbind use default domain = Yes interfaces = 127.0.0.1/255.0.0.0 10.X.X.30/255.255.240.0 template shell = /bin/bash username map = /etc/samba/smbusers template homedir = /home/users/%D/%U socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192 name resolve order = lmhosts wins bcast bind interfaces only = yes load printers = No dns proxy = No hosts allow = 10. 127. hosts deny = 0.0.0.0/0 smb ports = 139 My /etc/krb5.conf [logging] default = FILE:/var/log/krb5libs.log kdc = FILE:/var/log/krb5kdc.log admin_server = FILE:/var/log/kadmind.log kdc = SYSLOG:INFO:DAEMON default = SYSLOG:INFO:DAEMON admin_server = SYSLOG:INFO:DAEMON [libdefaults] default_realm = MYDOMAIN.FORREST.COM dns_lookup_realm = none dns_lookup_kdc = none ticket_lifetime = 24h forwardable = yes [realms] FORREST.COM = { kdc=SERVER1.FORREST.COM kdc=SERVER2.FORREST.COM } MYDOMAIN.FORREST.COM= { kdc=SERVER1.MYDOMAIN.FORREST.COM kdc=SERVER2.MYDOMAIN.FORREST.COM ) OTHERDOMAIN.FORREST.COM= = { kdc=SERVER1.OTHERDOMAIN.FORREST.COM kdc=SERVER1.OTHERDOMAIN.FORREST.COM } [domain_realm] .mydomain.forrest.com = MYDOMAIN.FORREST.COM .otherdomain.forrest.com = OTHERDOMAIN.FORREST.COM /etc/nsswitch.conf passwd: files winbind shadow: files group: files winbind hosts: files wins dns winbind These are the lines that I see in log.wb-ANOTERDOMAIN: [2008/07/31 10:03:35, 3] libsmb/clikrb5.c:ads_cleanup_expired_creds(528) ads_cleanup_expired_creds: Ticket in ccache[MEMORY:winbind_ccache] expiration Thu, 31 Jul 2008 20:03:28 CDT [2008/07/31 10:03:35, 10] libsmb/clikrb5.c:ads_krb5_mk_req(624) ads_krb5_mk_req: Ticket (SERVER1$@OTHERDOMAIN.FORREST.COM) in ccache (MEMORY:winbind_ccache) is valid until: (Thu, 31 Jul 2008 20:03:28 CDT - 1217552608) [2008/07/31 10:03:35, 10] libsmb/clikrb5.c:get_krb5_smb_session_key(735) Got KRB5 session key of length 16 [2008/07/31 10:03:35, 5] libads/ldap_utils.c:ads_do_search_retry_internal(64) Search for (objectclass=*) in <> gave 1 replies [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:store_cache_seqnum(440) store_cache_seqnum: success [OTHERDOMAIN][646535412 @ 1217516615] [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504) refresh_sequence_number: OTHERDOMAIN seq number is now 646535412 [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:centry_expired(549) centry_expired: Key U/S-1-5-21-2031228914-1097686851-784825492-55515 for domain OTHERDOMAIN expired [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:wcache_fetch(621) wcache_fetch: entry U/S-1-5-21-2031228914-1097686851-784825492-55515 expired for domain OTHERDOMAIN [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:query_user(1652) query_user: [Cached] - doing backend query for info for domain OTHERDOMAIN [2008/07/31 10:03:35, 3] nsswitch/winbindd_ads.c:query_user(453) ads: query_user [2008/07/31 10:03:35, 10] nsswitch/winbindd_ads.c:ads_cached_connection(46) ads_cached_connection [2008/07/31 10:03:35, 7] nsswitch/winbindd_ads.c:ads_cached_connection(59) Current tickets expire in 35993 seconds (at 1217552608, time is now 1217516615) [2008/07/31 10:03:35, 5] libads/ldap_utils.c:ads_do_search_retry_internal(64) Search for (objectSid=\01\05\00\00\00\00\00\05\15\00\00\00\F2\17\12\79\43\5F\6D\41\94\7C\C7\2E\DB\D8\00\00) in <dc=OTHERDOMAIN,dc=FORREST,dc=COM> gave 1 replies [2008/07/31 10:03:35, 3] nsswitch/winbindd_ads.c:query_user(499) ads query_user gave accountxyz [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470) refresh_sequence_number: OTHERDOMAIN time ok [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504) refresh_sequence_number: OTHERDOMAIN seq number is now 646535412 [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:wcache_save_user(872) wcache_save_user: S-1-5-21-2031228914-1097686851-784825492-55515 (acct_name accountxyz) [2008/07/31 10:03:35, 10] nsswitch/winbindd_cache.c:cache_store_response(2267) Storing response for pid 25745, len 3240 [2008/07/31 10:03:35, 4] nsswitch/winbindd_dual.c:fork_domain_child(1080) child daemon request 12 [2008/07/31 10:03:35, 10] nsswitch/winbindd_dual.c:child_process_request(479) process_request: request fn PAM_AUTH [2008/07/31 10:03:35, 3] nsswitch/winbindd_pam.c:winbindd_dual_pam_auth(1341) [ 4429]: dual pam auth OTHERDOMAIN\accountxyz [2008/07/31 10:03:35, 10] nsswitch/winbindd_pam.c:winbindd_dual_pam_auth(1364) winbindd_dual_pam_auth: domain: OTHERDOMAIN last was online [2008/07/31 10:03:35, 10] nsswitch/winbindd_pam.c:winbindd_dual_pam_auth_kerberos(1048) winbindd_dual_pam_auth_kerberos [2008/07/31 10:03:35, 8] lib/util.c:is_myname(2076) is_myname("OTHERDOMAIN") returns 0 [2008/07/31 10:03:35, 10] nsswitch/winbindd_pam.c:generate_krb5_ccache(424) using ccache: MEMORY:winbindd_pam_ccache (internal) [2008/07/31 10:03:35, 10] libads/kerberos.c:kerberos_kinit_password_ext(91) kerberos_kinit_password: using [MEMORY:winbindd_pam_ccache] as ccache and config [/var/lib/samba/smb_krb5/krb5.conf.MYDOMAIN] [2008/07/31 10:03:55, 10] nsswitch/winbindd_pam.c:winbindd_raw_kerberos_login(580) got TGT for accountXYZ@OTHERDOMAIN.FORREST.COM in MEMORY:winbindd_pam_ccache (valid until: Thu, 31 Jul 2008 20:03:57 CDT (1217552637), renewable till: Thu, 31 Jul 2008 20:03:57 CDT (1217552617)) [2008/07/31 10:04:05, 4] libsmb/clikrb5.c:ads_krb5_mk_req(610) ads_krb5_mk_req: Advancing clock by 2 seconds to cope with clock skew [2008/07/31 10:04:05, 3] libsmb/clikrb5.c:ads_cleanup_expired_creds(528) ads_cleanup_expired_creds: Ticket in ccache[MEMORY:winbindd_pam_ccache] expiration Thu, 31 Jul 2008 20:03:57 CDT [2008/07/31 10:04:05, 10] libsmb/clikrb5.c:ads_krb5_mk_req(624) ads_krb5_mk_req: Ticket (myserver$@MYDOMAIN.FORREST.COM) in ccache (MEMORY:winbindd_pam_ccache) is valid until: (Thu, 31 Jul 2008 20:03:57 CDT - 1217552637) [2008/07/31 10:04:05, 10] libsmb/clikrb5.c:get_krb5_smb_session_key(735) Got KRB5 session key of length 16 [2008/07/31 10:04:05, 10] passdb/secrets.c:secrets_named_mutex(984) secrets_named_mutex: got mutex for replay cache mutex [2008/07/31 10:04:05, 10] libads/kerberos_verify.c:ads_secrets_verify_ticket(279) ads_secrets_verify_ticket: enc type [1] failed to decrypt with error Bad encryption type [2008/07/31 10:04:05, 10] libads/kerberos_verify.c:ads_secrets_verify_ticket(279) ads_secrets_verify_ticket: enc type [3] failed to decrypt with error Bad encryption type [2008/07/31 10:04:05, 10] libads/kerberos_verify.c:ads_secrets_verify_ticket(270) ads_secrets_verify_ticket: enc type [23] decrypted message ! [2008/07/31 10:04:05, 10] passdb/secrets.c:secrets_named_mutex_release(996) secrets_named_mutex: released mutex for replay cache mutex 2008/07/31 10:04:05, 10] passdb/secrets.c:secrets_named_mutex_release(996) secrets_named_mutex: released mutex for replay cache mutex [2008/07/31 10:04:05, 10] rpc_parse/parse_net.c:dump_user_flgs(1563) dump_user_flgs account has LOGON_EXTRA_SIDS [2008/07/31 10:04:05, 10] libads/authdata.c:decode_pac_data(923) Successfully validated Kerberos PAC [2008/07/31 10:04:05, 10] libads/authdata.c:dump_pac_logon_info(735) The PAC: User Flags: 0x20 (32) User Flags: LOGON_EXTRA_SIDS 0x20 (32) User SID: S-1-5-21-2031228914-1097686851-784825492-55515 Group SID: S-1-5-21-2031228914-1097686851-784825492-513 Group Membership (Global and Universal Groups of own domain): 0: sid: S-1-5-21-2031228914-1097686851-784825492-513 attr: 0x7 == SE_GROUP_MANDATORY SE_GROUP_ENABLED_BY_DEFAULT SE_GROUP_ENABLED Group Membership (Domain Local Groups and Groups from Trusted Domains): 0: sid: S-1-5-21-1080491007-3045887122-1639496238-6307 attr: 0x20000007 == SE_GROUP_MANDATORY SE_GROUP_ENABLED_BY_DEFAULT SE_GROUP_ENABLED SE_GROUP_RESOURCE 1: sid: S-1-5-21-1080491007-3045887122-1639496238-11732 attr: 0x20000007 == SE_GROUP_MANDATORY SE_GROUP_ENABLED_BY_DEFAULT SE_GROUP_ENABLED SE_GROUP_RESOURCE Group Membership (Ressource Groups (SID History ?)): [2008/07/31 10:04:05, 10] nsswitch/winbindd_pam.c:winbindd_raw_kerberos_login(652) winbindd_raw_kerberos_login: winbindd validated ticket of myserver$@MYDOMAIN.FORREST.COM [2008/07/31 10:04:05, 10] nsswitch/winbindd_pam.c:winbindd_dual_pam_auth(1372) winbindd_dual_pam_auth_kerberos succeeded [2008/07/31 10:04:05, 10] libsmb/samlogon_cache.c:netsamlogon_cache_store(130) netsamlogon_cache_store: SID [S-1-5-21-2031228914-1097686851-784825492-55515] [2008/07/31 10:04:05, 10] libsmb/samlogon_cache.c:netsamlogon_clear_cached_user(87) netsamlogon_clear_cached_user: clearing U/S-1-5-21-2031228914-1097686851-784825492-55515 [2008/07/31 10:04:05, 10] libsmb/samlogon_cache.c:netsamlogon_clear_cached_user(95) netsamlogon_clear_cached_user: clearing UG/S-1-5-21-2031228914-1097686851-784825492-55515 [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470) refresh_sequence_number: OTHERDOMAIN time ok [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504) refresh_sequence_number: OTHERDOMAIN seq number is now 646535412 [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:wcache_save_name_to_sid(824) wcache_save_name_to_sid: OTHERDOMAIN\accountXYZ -> S-1-5-21-2031228914-1097686851-784825492-55515 [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(470) refresh_sequence_number: OTHERDOMAIN time ok [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:refresh_sequence_number(504) refresh_sequence_number: OTHERDOMAIN seq number is now 646535412 [2008/07/31 10:04:05, 10] nsswitch/winbindd_cache.c:password_policy(2119) password_policy: [Cached] - doing backend query for info for domain OTHERDOMAIN [2008/07/31 10:04:05, 10] nsswitch/winbindd_rpc.c:msrpc_password_policy(999) rpc: fetch password policy for OTHERDOMAIN [2008/07/31 10:04:05, 6] libsmb/clientgen.c:write_socket(152) write_socket(114,100) [2008/07/31 10:04:05, 0] lib/util_sock.c:write_data(564) write_data: write failure. Error = Connection reset by peer [2008/07/31 10:04:05, 6] libsmb/clientgen.c:write_socket(155) write_socket(114,100) wrote -1 [2008/07/31 10:04:05, 0] libsmb/clientgen.c:write_socket(158) write_socket: Error writing 100 bytes to socket 114: ERRNO = Connection reset by peer [2008/07/31 10:04:05, 0] libsmb/clientgen.c:cli_send_smb(188) Error writing 100 bytes to client. -1 (Connection reset by peer) [2008/07/31 10:04:05, 1] rpc_client/cli_pipe.c:cli_rpc_pipe_open(2223) cli_rpc_pipe_open: cli_nt_create failed on pipe \samr to machine SERVER1.OTHERDOMAIN.FORREST.com. Error was Write error: Connection reset by peer [2008/07/31 10:04:05, 10] nsswitch/winbindd_cm.c:cm_connect_sam(1825) cm_connect_sam: failed to connect to SAMR pipe for domain OTHERDOMAIN using NTLMSSP authenticated pipe: user OTHERDOMAIN\MYSERVER$. Error was NT_STATUS_UNEXPECTED_NETWORK_ERROR [2008/07/31 10:04:05, 3] nsswitch/winbindd_cm.c:connection_ok(1528) connection_ok: Connection to SERVER1.OTHERDOMAIN.FORREST.com for domain OTHERDOMAIN has died or was never started (fd == -1) [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_get(226) Returning valid cache entry: key = SAF/DOMAIN/OTHERDOMAIN, value 10.Y.X.99, timeout = Thu Jul 31 10:18:25 2008 [2008/07/31 10:04:05, 5] libsmb/namequery.c:saf_fetch(136) saf_fetch: Returning "10.y.x.99" for "OTHERDOMAIN" domain [2008/07/31 10:04:05, 10] nsswitch/winbindd_cm.c:cm_open_connection(1355) cm_open_connection: saf_servername is '10.y.x.99' for domain OTHERDOMAIN [2008/07/31 10:04:05, 5] libads/ldap.c:ads_try_connect(180) ads_try_connect: sending CLDAP request to 10.2.1.99 (realm: OTHERDOMAIN.FORREST.COM) [2008/07/31 10:04:05, 10] libads/dns.c:sitename_store(670) sitename_store: realm = [OTHERDOMAIN.FORREST.COM], sitename [Default-First-Site-Name], expire = [2147483647] [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_set(140) Adding cache entry with key = AD_SITENAME/DOMAIN/OTHERDOMAIN.FORREST.COM; value = Default-First-Site-Name and timeout = Mon Jan 18 21:14:07 2038 (929967002 seconds ahead) [2008/07/31 10:04:05, 5] libsmb/namecache.c:namecache_store(135) namecache_store: storing 1 address for SERVER1.OTHERDOMAIN.FORREST.com#20: 10.2.1.99:0 [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_set(140) Adding cache entry with key = AD_SITENAME/DOMAIN/OTHERDOMAIN.FORREST.COM; value = Default-First-Site-Name and timeout = Mon Jan 18 21:14:07 2038 (929967002 seconds ahead) [2008/07/31 10:04:05, 5] libsmb/namecache.c:namecache_store(135) namecache_store: storing 1 address for SERVER1.OTHERDOMAIN.FORREST.com#20: 10.2.1.99:0 [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_set(140) Adding cache entry with key = NBT/SERVER1.OTHERDOMAIN.FORREST.COM#20; value = 10.2.1.99:0 and timeout = Thu Jul 31 10:15:05 2008 (660 seconds ahead) [2008/07/31 10:04:05, 10] nsswitch/winbindd_cm.c:dcip_to_name(1091) dcip_to_name: flags = 0x1fd [2008/07/31 10:04:05, 10] nsswitch/winbindd_cm.c:cm_open_connection(1385) cm_open_connection: dcname is 'SERVER2.OTHERDOMAIN.FORREST.com' for domain OTHERDOMAIN [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_get(226) Returning valid cache entry: key = AD_SITENAME/DOMAIN/MYDOMAIN.FORREST.COM, value = Default-First-Site-Name, timeout = Mon Jan 18 21:14:07 2038 [2008/07/31 10:04:05, 5] libads/dns.c:sitename_fetch(709) sitename_fetch: Returning sitename for MYDOMAIN.FORREST.COM: "Default-First-Site-Name" [2008/07/31 10:04:05, 10] libsmb/namequery.c:internal_resolve_name(1166) internal_resolve_name: looking up SERVER2.OTHERDOMAIN.FORREST.com#20(sitename Default-First-Site-Name) [2008/07/31 10:04:05, 10] lib/gencache.c:gencache_get(226) Returning valid cache entry: key = NBT/SERVER2.OTHERDOMAIN.FORREST.COM#20, value = 10.y.x.99:0, timeout = Thu Jul 31 10:15:05 2008 [2008/07/31 10:04:05, 5] libsmb/namecache.c:namecache_fetch(214) name server2.OTHERDOMAIN.forrest.com#20 found. [2008/07/31 10:04:05, 10] nsswitch/winbindd_cm.c:cm_prepare_connection(654) cm_prepare_connection: connecting to DC SERVER2.otherdomain.forrest.comfor domain OTHERDOMAIN [2008/07/31 10:04:05, 10] passdb/secrets.c:secrets_named_mutex(984) secrets_named_mutex: got mutex for server1.OTHERDOMAIN.forrest.com [2008/07/31 10:04:05, 6] libsmb/clientgen.c:write_socket(152) write_socket(114,194) [2008/07/31 10:04:05, 6] libsmb/clientgen.c:write_socket(155) write_socket(114,194) wrote 194 [2008/07/31 10:04:05, 10] lib/util_sock.c:read_smb_length_return_keepalive(623) got smb length of 196 This is the error in authpriv facility: Jul 31 10:03:25 myserver vsftpd: pam_winbind.so(ftp:auth): [pamh: 0x089f92e0] ENTER: pam_sm_authenticate (flags: 0x0000) Jul 31 10:03:25 myserver vsftpd: pam_winbind.so(ftp:auth): getting password (0x00000081) Jul 31 10:03:25 myserver vsftpd: pam_winbind.so(ftp:auth): Verify user 'otherdomain\accountxyz' Jul 31 10:03:35 myserver vsftpd: pam_winbind.so(ftp:auth): enabling krb5 login flag Jul 31 10:04:10 myserver vsftpd: pam_winbind.so(ftp:auth): pam_winbind_request: read from socket failed! Jul 31 10:04:10 myserver vsftpd: pam_winbind.so(ftp:auth): internal module error (retval = 3, user = 'otherdomain\accountxyz') Jul 31 10:04:10 myserver vsftpd: pam_winbind.so(ftp:auth): [pamh: 0x089f92e0] LEAVE: pam_sm_authenticate returning 3
Jeremy Allison
2008-Aug-01 17:19 UTC
[Samba] Samba 3.0.31 stills fails to read and write to socket.
On Fri, Aug 01, 2008 at 10:46:54AM -0500, Jose Santiago Oyervides wrote:> Hi, > I recently upgraded my servers from 3.0.28 to 3.0.31 trying to solve the > winbind issue previously reported (Bug# 5551) but the issue is still > happening in my servers. > > I have an ftp server (vsftpd), configured to use pam_winbind with krb5_auth > and I see some random disconnects and my users cant login. My samba servers > are member of a Windows 2003 domain. > > The relevant lines on my log.wb-OTHERDOMAIN are saying that the write to the > socket failed because the connection was reset by peer, this happened also > on 3.0.28, i was hoping that 3.0.31 fix this issue. > > Im including my configuration and my log files. This happens only when > pam_winbind authenticates users of other domains, sometimes it gets fixed > itself because in my krb5.conf i have configured several domain controllers > for the other domains and it changes the connections to the next server, but > sometimes it gets stuck with one failed server and all my users cant login > for a while.This is your problem : config [/var/lib/samba/smb_krb5/krb5.conf.MYDOMAIN] [2008/07/31 10:03:55, 10] nsswitch/winbindd_pam.c:winbindd_raw_kerberos_login(580) got TGT for accountXYZ@OTHERDOMAIN.FORREST.COM in MEMORY:winbindd_pam_ccache (valid until: Thu, 31 Jul 2008 20:03:57 CDT (1217552637), renewable till: Thu, 31 Jul 2008 20:03:57 CDT (1217552617)) [2008/07/31 10:04:05, 4] libsmb/clikrb5.c:ads_krb5_mk_req(610) ads_krb5_mk_req: Advancing clock by 2 seconds to cope with clock skew Note the 30 second gap in timestamps. Looks like the call : krb5_ret = cli_krb5_get_ticket(local_service, time_offset, &tkt, &session_key_krb5, 0, cc, NULL); at line 604: in nsswitch/winbindd_pam.c is taking ages to contact a KDC. Do you have DNS resolution issues ? Jeremy.