Bharath Bheemarasetti
2023-Jul-04 13:02 UTC
[Samba] winbindd authentication fails with NT_STATUS_RPC_SEC_PKG_ERROR intermittently
>>> Why are you using a non default port for SMB ? The defaults are 139 and 445.

I am not aware of the complete details, but we have a dependency for the smb server to support multiple domains, so we spin up a container with samba for each of the domains and a front server routes it to the appropriate container. That is the reason for having non-default ports and binding interfaces.

>>> Is this Samba server a part of a cluster ?

No, it is not part of a cluster.

On Tue, Jul 4, 2023 at 4:49 PM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:

> >>> What are the DC's ?
>
> DC is a read-write Windows Active Directory domain controller on Windows Server 2016.
>
> >>> Why are you using NTLMv2 ? what is it required for ?
>
> The smb client here is a Windows Server 2016 machine that is part of a domain, and the smb server is on Ubuntu 22.04. The communication between the client and the server uses NTLMv2, while the communication between the server and the DC happens with Kerberos. The error we see is when trying to mount the smb share on the client.
>
> >>> Did you add idmap config lines for the '<workgroup>' domain ?
>
> Not yet, will add that the next time we hit the issue.
>
> >>> Is it possible you are also using sssd or something similar ?
>
> No, we are not using sssd.
>
> On Mon, Jul 3, 2023 at 1:10 AM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:
>
>> On further investigation, the error that shows up in packet capture is
>> that the DC is returning [Fault: nca_s_fault_sec_pkg_error] for the
>> NetrLogonSamLogonEx call. There are no error logs (or any logs) regarding
>> the netlogon call failure in the netlogon logs, even after enabling debug
>> logs on the DC. One more interesting thing is that restarting the netlogon.exe
>> service on the DC also fixes the issue temporarily, similar to restarting
>> the smb service.
>>
>> Is it possible that something is going stale in the winbindd memory/cache
>> that is getting fixed on these restarts? If yes, how do I go about
>> debugging that, as it is not apparent from the logs?
>>
>> P.S: We have different setups and the frequency of this error is
>> different in all of them. Also, there is another setup with Samba 4.7 on
>> Ubuntu 18.04 and everything works fine there.
>>
>> On Fri, Jun 16, 2023 at 1:26 PM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:
>>
>>> First 'winbind enum' lines, they can and do slow things down in large
>>> domains and aren't required at all, getent etc will work without them.
>>> there are some old programs that will not work without them, but when
>>> was the last time you ran 'finger' for instance ?
>>>
>>> I made this change and it makes some difference but doesn't fix the issue entirely. Earlier the auth calls used to fail after around a day; that has increased to 2 days now, after which the auth calls fail with NT_STATUS_RPC_SEC_PKG_ERROR and winbind needs to be restarted for it to work. We use NTLMv2 for authentication, and using the ntlm_auth tool (https://www.samba.org/samba/docs/current/man-html/ntlm_auth.1.html) returns the same NT_STATUS_RPC_SEC_PKG_ERROR error as well, while wbinfo -i returns the correct user info.
>>>
>>> Is there anything else that can be done to fix this permanently?
>>>
>>> You might also want to read the smb.conf manpage, you have lots of lines
>>> that I would never set.
>>>
>>> Thanks, I removed some lines which are not used anymore and will be cleaning up others shortly.
>>>
>>> On Sat, Jun 3, 2023 at 1:09 PM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:
>>>
>>>> A couple of things possible, from 4.8.0 winbind must be running and your
>>>> smb.conf is, to be blunt, rubbish. You need to set the workgroup, you
>>>> need to have idmap config lines for the workgroup, the 'winbind enum'
>>>> lines only slow things down and 'map untrusted to domain' has been removed.
>>>>
>>>> Winbind is running and the workgroup was set as well. I omitted some lines from the smb.conf shared previously as I wasn't sure if they were relevant or not. I've added the full content below. Also, the share is being accessed by a Windows client which is part of the domain, and it does work fine for a few hours after restarting the smbd and winbind services. Does 'winbind enum' have any relation to that?
>>>>
>>>> https://www.samba.org/samba/docs/current/man-html/smb.conf.5.html#WINBINDENUMUSERS mentions turning off 'winbind enum' can cause some problems
>>>>
>>>> *Configuration:*
>>>>
>>>> netbios name = clustF994DF
>>>> realm = <domain>
>>>>
>>>> bind interfaces only = yes
>>>> interfaces = 127.0.0.138 lo:138
>>>>
>>>> workgroup = <workgroup>
>>>> security = ads
>>>> server role = member server
>>>>
>>>> auth methods = winbind
>>>>
>>>> idmap config * : backend = tdb
>>>> idmap config * : range = 10000-24999999
>>>>
>>>> winbind enum users = yes
>>>> winbind enum groups = yes
>>>> usershare allow guests = no
>>>>
>>>> map untrusted to domain = Yes
>>>> allow trusted domains = no
>>>> server string = %h
>>>> dns proxy = no
>>>> log file = /var/log/samba/log.%m
>>>> max log size = 1000
>>>> panic action = /usr/share/samba/panic-action %d
>>>> smb ports = 1445
>>>> pid directory = /var/run/samba
>>>>
>>>> server min protocol = SMB2
>>>> strict sync = yes
>>>> sync always = no
>>>>
>>>> smb encrypt = auto
>>>>
>>>> aio read size = 1
>>>> aio write size = 1
>>>>
>>>> smb2 max read = 1048576
>>>> smb2 max write = 1048576
>>>> smb2 max trans = 1048576
>>>>
>>>> socket options = TCP_NODELAY SO_RCVBUF=10485760 SO_SNDBUF=10485760
>>>>
>>>> usershare owner only = no
>>>>
>>>> load printers = no
>>>> printing = bsd
>>>> printcap name = /dev/null
>>>> disable spoolss = yes
>>>>
>>>> machine password timeout = 0
>>>>
>>>> nt acl support = yes
>>>> vfs objects = acl_xattr
>>>> map acl inherit = yes
>>>> store dos attributes = yes
>>>>
>>>> log level = 5
>>>> max log size = 1000
>>>>
>>>> *Share configuration:*
>>>>
>>>> path = <path>
>>>> guest ok = no
>>>> writeable = no
>>>> browseable = no
>>>> valid users = "<domain>\<user>","+<domain>\<user group>"
>>>> force user = root
>>>>
>>>> On Fri, Jun 2, 2023 at 3:21 AM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I recently upgraded a smb server from Ubuntu 18.04 to Ubuntu 20.04
>>>>> which required the Samba version to be upgraded from 4.7.6 to 4.15.13.
>>>>> Post the upgrade, winbind authentication fails
>>>>> with NT_STATUS_RPC_SEC_PKG_ERROR intermittently. The error goes away on
>>>>> restarting the smb service but comes back after some time. There were no
>>>>> issues with the setup before the upgrade.
>>>>> Tried clearing the cached tdb files as well but the issue still comes
>>>>> back after some time.
>>>>> <trimmed the log lines>
>>>>>
>>>>> Below is the configuration:
>>>>> security = ads
>>>>> server role = member server
>>>>> auth methods = winbind
>>>>> idmap config * : backend = tdb
>>>>> idmap config * : range = 10000-24999999
>>>>> winbind enum users = yes
>>>>> winbind enum groups = yes
>>>>> usershare allow guests = no
>>>>> map untrusted to domain = Yes
>>>>> allow trusted domains = no
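For readers following the smb.conf advice quoted above (set the workgroup, add idmap config lines for the workgroup, drop the 'winbind enum' lines and 'map untrusted to domain'), a minimal sketch of the relevant [global] lines might look like the fragment below. The `rid` backend and both numeric ranges are assumptions chosen for illustration, not values from this thread; changing ID ranges on an existing server changes the Unix IDs that users map to, so treat this as a starting point only. It covers the configuration advice and is not presented as a fix for the intermittent error, which was matched to a bug report later in the thread.

```
[global]
    workgroup = <workgroup>
    realm = <domain>
    security = ads

    # Default domain, used for BUILTIN and anything not matched below.
    # The range is an assumption; it must not overlap the domain range.
    idmap config * : backend = tdb
    idmap config * : range = 3000-7999

    # Per-domain mapping for the workgroup.
    # Backend and range are assumptions for illustration.
    idmap config <workgroup> : backend = rid
    idmap config <workgroup> : range = 10000-24999999

    # 'winbind enum users', 'winbind enum groups' and
    # 'map untrusted to domain' are left out, per the advice above.
```

Running `testparm` after editing is a reasonable way to check that the file parses and to see which parameters Samba reports as unknown.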
Rowland Penny
2023-Jul-04 19:00 UTC
[Samba] winbindd authentication fails with NT_STATUS_RPC_SEC_PKG_ERROR intermittently
On 04/07/2023 14:02, Bharath Bheemarasetti via samba wrote:
>>>> Why are you using a non default port for SMB ? The defaults are 139 and 445.
>
> I am not aware of the complete details, but we have a dependency for the
> smb server to support multiple domains, so we spin up a container with
> samba for each of the domains and a front server routes it to the
> appropriate container. That is the reason for having non-default ports
> and binding interfaces.
>

I have never heard of this problem before, but Stefan Metzmacher has opened an MR:

https://gitlab.com/samba-team/samba/-/merge_requests/3162

And there is a bug report here:

https://bugzilla.samba.org/show_bug.cgi?id=15413

Could this be your problem ?

Rowland
Bharath Bheemarasetti
2023-Jul-05 07:15 UTC
[Samba] winbindd authentication fails with NT_STATUS_RPC_SEC_PKG_ERROR intermittently
> I have never heard of this problem before, but Stefan Metzmacher has
> opened an MR: https://gitlab.com/samba-team/samba/-/merge_requests/3162
> And there is a bug report here:
> https://bugzilla.samba.org/show_bug.cgi?id=15413
> Could this be your problem ?

Thanks, that is the exact problem we are facing. Will it be backported to 4.15?

On Tue, Jul 4, 2023 at 6:32 PM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:

> >>> Why are you using a non default port for SMB ? The defaults are 139 and 445.
>
> I am not aware of the complete details, but we have a dependency for the smb server to support multiple domains, so we spin up a container with samba for each of the domains and a front server routes it to the appropriate container. That is the reason for having non-default ports and binding interfaces.
>
> >>> Is this Samba server a part of a cluster ?
>
> No, it is not part of a cluster.
>
> On Tue, Jul 4, 2023 at 4:49 PM Bharath Bheemarasetti <bharath.bheemarasetti at gmail.com> wrote:
>
>> >>> What are the DC's ?
>>
>> DC is a read-write Windows Active Directory domain controller on Windows Server 2016.
>>
>> >>> Why are you using NTLMv2 ? what is it required for ?
>>
>> The smb client here is a Windows Server 2016 machine that is part of a domain, and the smb server is on Ubuntu 22.04. The communication between the client and the server uses NTLMv2, while the communication between the server and the DC happens with Kerberos. The error we see is when trying to mount the smb share on the client.
>>
>> >>> Did you add idmap config lines for the '<workgroup>' domain ?
>>
>> Not yet, will add that the next time we hit the issue.
>>
>> >>> Is it possible you are also using sssd or something similar ?
>>
>> No, we are not using sssd.
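Since the failure described in this thread is intermittent and shows up through ntlm_auth while wbinfo -i still succeeds, it can help to script the check so the first failure is noticed promptly. The following is a minimal Python sketch under stated assumptions: the function names `has_sec_pkg_error` and `run_and_check` are hypothetical, and matching on the status string assumes the tool prints `NT_STATUS_RPC_SEC_PKG_ERROR` in its output, as reported above. It does not fix anything; it only detects the condition.

```python
import subprocess

# Status string reported in this thread. Matching on it assumes the
# failing tool includes this name in its stdout or stderr.
SEC_PKG_ERROR = "NT_STATUS_RPC_SEC_PKG_ERROR"


def has_sec_pkg_error(output: str) -> bool:
    """Return True if the captured output mentions the status string."""
    return SEC_PKG_ERROR in output


def run_and_check(cmd: list[str], timeout: float = 30.0) -> bool:
    """Run a command (for example an existing ntlm_auth invocation) and
    report whether its combined stdout/stderr contains the status string.

    The command line itself is supplied by the caller; nothing here
    assumes particular ntlm_auth options.
    """
    result = subprocess.run(
        cmd, capture_output=True, text=True, timeout=timeout, check=False
    )
    return has_sec_pkg_error(result.stdout + result.stderr)
```

A cron job or systemd timer could call `run_and_check` with whatever ntlm_auth command line is already in use and log a timestamp when it returns True, which gives a clearer picture of how long winbindd runs before the error appears.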