Rowland Penny
2023-Jun-03 14:57 UTC
[Samba] Fwd: PAM Offline Authentication in Ubuntu 22.04...
On 03/06/2023 12:27, Kees van Vloten via samba wrote:> > However for a lookup with the command 'id' pam_winbind is irrelevant > since we only do a user and group lookup, it is nss-winbind that is > being used. > > I don't see any differences between our smb.conf file that could cause > this issue, do you? > > I guess the next thing to try is increase log level to a high value and > repeat the test, any other ideas? > > - Kees.When you logon it is pam_winbind that is used, from my tests there is just one line different in /var/log/auth.log between having the network connected or not at logon Without the network, I get this extra line: May 28 08:20:59 member1 lightdm: pam_winbind(lightdm:auth): User rowland logged on using cached credentials If you use gdm3, you get a similar line. By all means try a higher log level, but my gut feeling is that it is something at the OS level that is creating the wait. Rowland
Kees van Vloten
2023-Jun-03 18:12 UTC
[Samba] Fwd: PAM Offline Authentication in Ubuntu 22.04...
On 03-06-2023 16:57, Rowland Penny via samba wrote:> > > On 03/06/2023 12:27, Kees van Vloten via samba wrote: >> >> However for a lookup with the command 'id' pam_winbind is irrelevant >> since we only do a user and group lookup, it is nss-winbind that is >> being used. >> >> I don't see any differences between our smb.conf file that could >> cause this issue, do you? >> >> I guess the next thing to try is increase log level to a high value >> and repeat the test, any other ideas? >> >> - Kees. > > When you logon it is pam_winbind that is used, from my tests there is > just one line different in /var/log/auth.log between having the > network connected or not at logon > Without the network, I get this extra line: > > May 28 08:20:59 member1 lightdm: pam_winbind(lightdm:auth): User > rowland logged on using cached credentials > > If you use gdm3, you get a similar line.I understand that, however the test with 'id testuser' does not do authentication. SInce it already reproduces the issue that rules out any issues with pam_winbind. Therefore it must be an issue in nss_winbind.> > By all means try a higher log level, but my gut feeling is that it is > something at the OS level that is creating the wait. > > Rowland > >I ran both connected and disconnected 'id testuser' through strace and in smb.conf: "log level 3 winbind:10" The first 200 lines are left away since they are equal in both situation. strace id testuser connected: mprotect(0x7fa1cf90f000, 4096, PROT_READ) = 0 mprotect(0x7fa1cf925000, 4096, PROT_READ) = 0 mprotect(0x7fa1cf939000, 4096, PROT_READ) = 0 munmap(0x7fa1cff3d000, 122599)????????? = 0 getpid()??????????????????????????????? = 1961 getpid()??????????????????????????????? = 1961 getpid()??????????????????????????????? = 1961 futex(0x7fa1cfca20c8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 lstat("/run/samba/winbindd", {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0 lstat("/run/samba/winbindd/pipe", {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 socket(AF_UNIX, SOCK_STREAM, 0)???????? = 3 fcntl(3, F_GETFL)?????????????????????? = 0x2 (flags O_RDWR) fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)??? = 0 fcntl(3, F_GETFD)?????????????????????? = 0 fcntl(3, F_SETFD, FD_CLOEXEC)?????????? = 0 connect(3, {sa_family=AF_UNIX, sun_path="/run/samba/winbindd/pipe"}, 110) = 0 getpid()??????????????????????????????? = 1961 getpid()??????????????????????????????? = 1961 poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}]) write(3, "X\10\0\0\0\0\0\0\0\0\0\0\251\7\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2136) = 2136 getpid()??????????????????????????????? = 1961 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}]) read(3, "\250\17\0\0\2\0\0\0 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4008) = 4008 poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}]) write(3, "X\10\0\0\1\0\0\0\0\0\0\0\251\7\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2136) = 2136 getpid()??????????????????????????????? = 1961 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}]) read(3, "\250\17\0\0\2\0\0\0kvv\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4008) = 4008 openat(AT_FDCWD, "/etc/passwd", O_RDONLY|O_CLOEXEC) = 4 lseek(4, 0, SEEK_CUR)?????????????????? = 0 fstat(4, {st_mode=S_IFREG|0644, st_size=2475, ...}) = 0 read(4, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 2475 strace -tt id testuser disconnected and ctrl-c where it "hangs": 19:42:47.526437 mprotect(0x7ff2bc581000, 4096, PROT_READ) = 0 19:42:47.526499 mprotect(0x7ff2bc597000, 4096, PROT_READ) = 0 19:42:47.526548 mprotect(0x7ff2bc5ab000, 4096, PROT_READ) = 0 19:42:47.526594 munmap(0x7ff2bcbaf000, 122599) = 0 19:42:47.526660 getpid()??????????????? = 3979 19:42:47.526699 getpid()??????????????? = 3979 19:42:47.526733 getpid()??????????????? = 3979 19:42:47.526771 futex(0x7ff2bc9140c8, FUTEX_WAKE_PRIVATE, 2147483647) = 0 19:42:47.526856 lstat("/run/samba/winbindd", {st_mode=S_IFDIR|0755, st_size=60, ...}) = 0 19:42:47.526928 lstat("/run/samba/winbindd/pipe", {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0 19:42:47.526970 socket(AF_UNIX, SOCK_STREAM, 0) = 3 19:42:47.527007 fcntl(3, F_GETFL)?????? = 0x2 (flags O_RDWR) 19:42:47.527062 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 19:42:47.527127 fcntl(3, F_GETFD)?????? = 0 19:42:47.527174 fcntl(3, F_SETFD, FD_CLOEXEC) = 0 19:42:47.527210 connect(3, {sa_family=AF_UNIX, sun_path="/run/samba/winbindd/pipe"}, 110) = 0 19:42:47.527396 getpid()??????????????? = 3979 19:42:47.527502 getpid()??????????????? = 3979 19:42:47.527553 poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}]) 19:42:47.527653 write(3, "X\10\0\0\0\0\0\0\0\0\0\0\213\17\0\0\0\10\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2136) = 2136 19:42:47.527817 getpid()??????????????? = 3979 19:42:47.527852 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN}]) 19:42:47.527895 read(3, "\250\17\0\0\2\0\0\0 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4008) = 4008 19:42:47.527938 poll([{fd=3, events=POLLIN|POLLOUT|POLLHUP}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}]) 19:42:47.527979 write(3, "X\10\0\0\1\0\0\0\0\0\0\0\213\17\0\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2136) = 2136 19:42:47.528428 getpid()??????????????? = 3979 19:42:47.528465 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout) 19:42:52.531442 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout) 19:42:57.537318 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = 0 (Timeout) 19:43:02.544018 poll([{fd=3, events=POLLIN|POLLHUP}], 1, 5000) = ? ERESTART_RESTARTBLOCK (Interrupted by signal) 19:43:02.711306 --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- 19:43:02.711867 +++ killed by SIGINT +++ log.wb-SAMDOM show the domain is reported offline before "id testuser" is started [2023/06/03 19:40:06.824113,? 0, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cache.c:3464(set_global_winbindd_state_offline) ? set_global_winbindd_state_offline: Offline requested [2023/06/03 19:40:06.824140, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cm.c:163(set_domain_offline) ? set_domain_offline: called for domain SAMDOM [2023/06/03 19:40:06.824175, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cm.c:163(set_domain_offline) ? set_domain_offline: called for domain SAMDOM [2023/06/03 19:40:06.824205, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cache.c:1794(wcache_name_to_sid) ? wcache_name_to_sid: namemap_cache_find_name failed [2023/06/03 19:40:06.824216,? 4, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual.c:1641(child_handler) ? Finished processing child request 55 [2023/06/03 19:40:06.824222, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual.c:159(child_write_response) ? Writing 4024 bytes to parent [2023/06/03 19:42:37.973329, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cm.c:3284(winbind_msg_ip_dropped) ? IP 192.168.1.108 dropped [2023/06/03 19:42:37.973344, 10, pid=2926, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_cm.c:3284(winbind_msg_ip_dropped) ? IP fe80::3f67:f5b0:e1ca:720d dropped log.winbindd-idmap: [2023/06/03 19:42:47.528286, 10, pid=2782, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual.c:124(child_read_request) ? Need to read 214 extra bytes [2023/06/03 19:42:47.528301,? 4, pid=2782, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual.c:1633(child_handler) ? child daemon request 55 [2023/06/03 19:42:47.528307, 10, pid=2782, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual_ndr.c:538(winbindd_dual_ndrcmd) ? winbindd_dual_ndrcmd: Running command wbint_GetNssInfo (domain '(null)') [2023/06/03 19:42:47.528338, 10, pid=2782, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/idmap_ad.c:532(idmap_ad_query_user) ? idmap_ad_query_user: Filter: [(objectsid=\01\05\00\00\00\00\00\05\15\00\00\00\FB'\BF\F9.lc\D8\C5\CF]y\CD\04\00\00)] [2023/06/03 19:42:47.528357, 10, pid=2782, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/idmap_ad.c:289(idmap_ad_tldap_debug) ? idmap_ad_tldap: tldap_msg_send: sending msg 29 ============ THIS IS WHERE THE TIMEOUT IN STRACE HAPPENS =========== log.winbindd [2023/06/03 19:42:47.528012,? 3, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd.c:496(process_request_send) ? process_request_send: [nss_winbind (3979)] Handling async request: GETPWNAM [2023/06/03 19:42:47.528022,? 3, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_getpwnam.c:59(winbindd_getpwnam_send) ? [nss_winbind (3979)] Winbind external command GETPWNAM start. ? Query username 'testuser'. [2023/06/03 19:42:47.528032,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_lookupname.c:52(wb_lookupname_send) ? WB command lookupname start. ? Search namespace 'SAMDOM' and domain 'SAMDOM' for name 'testuser'. [2023/06/03 19:42:47.528062, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual_ndr.c:112(wbint_bh_raw_call_send) ? wbint_bh_raw_call_send: Got opnum 3 for domain SAMDOM from cache [2023/06/03 19:42:47.528077,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_lookupname.c:118(wb_lookupname_recv) ? WB command lookupname end. ? Found SID S-1-5-21-4190054395-3630394414-2036191173-1229 with SID type 1. [2023/06/03 19:42:47.528088,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_getpwsid.c:49(wb_getpwsid_send) ? WB command getpwsid start. ? Query user SID S-1-5-21-4190054395-3630394414-2036191173-1229. [2023/06/03 19:42:47.528096,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:56(wb_queryuser_send) ? WB command queryuser start. ? Query user sid S-1-5-21-4190054395-3630394414-2036191173-1229 [2023/06/03 19:42:47.528107, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:95(wb_queryuser_idmap_setup_done) ? Convert the user SID S-1-5-21-4190054395-3630394414-2036191173-1229 to XID. [2023/06/03 19:42:47.528113,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_sids2xids.c:84(wb_sids2xids_send) ? WB command sids2xids start. ? Resolving 1 SID(s). [2023/06/03 19:42:47.528120, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_sids2xids.c:156(wb_sids2xids_send) ? 0: SID S-1-5-21-4190054395-3630394414-2036191173-1229 [2023/06/03 19:42:47.528141, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_sids2xids.c:175(wb_sids2xids_send) ? Found 1 (out of 1) SID(s) in cache. [2023/06/03 19:42:47.528152,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_sids2xids.c:775(wb_sids2xids_recv) ? WB command sids2xids end. [2023/06/03 19:42:47.528157,? 5, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_sids2xids.c:779(wb_sids2xids_recv) ? 0: Found XID 1104 for SID S-1-5-21-4190054395-3630394414-2036191173-1229 [2023/06/03 19:42:47.528163, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:134(wb_queryuser_got_uid) ? Received XID 1104 for SID S-1-5-21-4190054395-3630394414-2036191173-1229. [2023/06/03 19:42:47.528169, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:148(wb_queryuser_got_uid) [2023/06/03 19:42:47.528169, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:148(wb_queryuser_got_uid) ? Preconfigured 'Domain Users' RID 513 was used to create group SID S-1-5-21-4190054395-3630394414-2036191173-513 from user SID S-1-5-21-4190054395-3630394414-2036191173-1229. [2023/06/03 19:42:47.528176, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:154(wb_queryuser_got_uid) ? Setting 'homedir' to the template '/home/%U'. [2023/06/03 19:42:47.528181, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:160(wb_queryuser_got_uid) ? Setting 'shell' to the template '/bin/bash'. [2023/06/03 19:42:47.528206, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:167(wb_queryuser_got_uid) ? Filling data received from netsamlogon_cache ????? state->info: struct wbint_userinfo ????????? domain_name????????????? : * ????????????? domain_name????????????? : 'SAMDOM' ????????? acct_name??????????????? : * ????????????? acct_name??????????????? : 'testuser' ????????? full_name??????????????? : * ????????????? full_name??????????????? : 'Test User' ????????? homedir????????????????? : * ????????????? homedir????????????????? : '/home/%U' ????????? shell??????????????????? : * ????????????? shell??????????????????? : '/bin/bash' ????????? uid????????????????????? : 0x00000000000003ec (1104) ????????? primary_gid????????????? : 0x00000000ffffffff (4294967295) ????????? primary_group_name?????? : NULL ????????? user_sid???????????????? : S-1-5-21-4190054395-3630394414-2036191173-1229 ????????? group_sid??????????????? : S-1-5-21-4190054395-3630394414-2036191173-513 [2023/06/03 19:42:47.528248, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/wb_queryuser.c:198(wb_queryuser_got_uid) ? Domain name is set, calling dcerpc_wbint_GetNssInfo_send() ============ THIS IS WHERE THE TIMEOUT IN STRACE HAPPENS ===========[2023/06/03 19:42:56.059505, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_util.c:715(winbindd_ping_offline_domains) ? winbindd_ping_offline_domains: Domain BUILTIN is online [2023/06/03 19:42:56.059550, 10, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_util.c:720(winbindd_ping_offline_domains) ? winbindd_ping_offline_domains: We are globally offline, do nothing. [2023/06/03 19:43:02.711750,? 6, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd.c:772(winbind_client_activity) ? winbind_client_activity[3979:GETPWNAM]: client has closed connection - removing client [2023/06/03 19:43:02.711803,? 2, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd.c:884(remove_client) ? final write to client failed: Broken pipe [2023/06/03 19:43:02.711839,? 1, pid=2759, effective(0, 0), real(0, 0), class=winbind] ../../source3/winbindd/winbindd_dual.c:370(wb_child_request_cleanup) ? wb_child_request_cleanup: keep orphaned subreq[0x559ec34dd000] Unfortunately I don't see a clear correlation between the strace output and an event in the logging. I guess I have to run a strace on the winbindd pid in parallel to capture the exact thing happening in winbindd. - Kees.