Alexander Spannagel
2019-Feb-23 20:54 UTC
[Samba] winbind causing huge timeouts/delays since 4.8
Am 23.02.19 um 15:48 schrieb Rowland Penny via samba:>>>>>>> If you have, as you have, 'files sss winbind' in the the passwd >>>>>>> & group line in nsswitch.conf, means this: >>>>>>> First /etc/passwd or /etc/group is searched and if the user or >>>>>>> group is found, this info is returned. >>>>>>> Next sssd will be asked, 'do you know this user or group ?' if >>>>>>> found, the info is returned. >>>>>>> Finally winbind will be asked, 'do you know this user or >>>>>>> group ?' if found, the info is returned. >>>>>>> >>>>>>> Lets take a user called 'fred', this user is in AD. The first >>>>>>> search will return nothing, so sssd is asked, this 'asks' AD and >>>>>>> returns the users info. Finally, wait that's it, we have the >>>>>>> info, there is no need to ask winbind for anything. >>>>>> >>>>>> That is incorrect. Alexander stated: >>>>>> >>>>>>> No. we use max. 3 auth providers: (1. and 2. on all unix >>>>>>> servers) 1. unix (local passwd) >>>>>>> for static OS/service accounts across all our env >>>>>>> 2. sssd (with unix ldap servers as provider) >>>>>>> unix experienced user and application related service accounts >>>>>>> 3. samba/winbind >>>>>>> for windows users/services needing access to a group of unix >>>>>>> servers >>>>>> >>>>>> And: >>>>>> >>>>>>> They don't - as stated above we use sssd for query/caching >>>>>>> entries from our ldap directory server and not Windows >>>>>>> DomainConmtrollers >>>>>>> - also this is possible, but makes more trouble and don't >>>>>>> provide what samba's smb/windbind does. >>>>>> >>>>>> He clearly writes (in multiple emails) that sssd is configured to >>>>>> use his unix ldap servers and not AD. >>>>>> >>>>>> Maybe three sources of user databases is not regular, but I fail >>>>>> to see why this should be a problem (provided that usernames, >>>>>> uidNumbers and such are unique across the databases). 
>>>>>
>>>>> And there is the problem, if 'fred' is in /etc/passwd, that user
>>>>> will be used, but what if you meant fred in ldap or AD ?
>>>>>

We are aware of this possible clash, and it is handled during user account creation.

>>>>> There is absolutely no point in having 4 databases (yes there are
>>>>> 4, Unix, sssd, winbind and the ldap lines in smb.conf), they
>>>>> could all be combined in AD.

No, that won't work, as our Windows team doesn't accept schema changes for Unix in AD.

>>>>>
>>>>> The main problem is that the OP wants Samba changing to cope with
>>>>> his mess, it might be a valid change, but the reason for the
>>>>> change is invalid.

The initial reason we hit this after upgrading from samba-4.7 to 4.8 is a script that frequently checks the system for changes; its final "chown root.wheel FILE" freezes the system for approximately a minute (similar to "wbinfo -i foo"). The winbind and sssd logs showed that both were asked about a user "root.wheel". Why that notation, which under Linux usually means user.group, is looked up as an account with a dot in its name is another question, and more related to glibc. Removing winbind from nss fixed the freeze, but isn't an option. It comes down to this: asking winbind for an unknown user without a domain takes a long time before it returns WBC_ERR_DOMAIN_NOT_FOUND.

>>>>
>>>> Well, I think the problem is you _assume_ users are in multiple
>>>> databases and we just don't know that. I think there is a good
>>>> change Alexander perfectly knows what he is doing and users are
>>>> unique across databases.
>>>>
>>>> Nevertheless, at some point nss is clearly querying winbind, which
>>>> means nss did not find the user in either /etc/passwd nor via sssd.
>>>> In the case that winbind _is_ queried, Alexander is experiencing,
>>>> like he wrote, 'frequently system hangs/slowness for a couple of
>>>> seconds' and he observed that winbind is causing this behaviour.
>>>> >>>> So maybe we should set our focus on winbind instead of the multiple >>>> database stuff and figure out why it behaves like this since the >>>> upgrade from 4.7 to 4.8. I would say we should start with fixing >>>> the winbind stuff in smb.conf. Right? >>>> >>>> -Remy >>>> >>>> >>>> P.S. I am following this thread since I also noticed occasional >>>> 'hangs' when the system is querying winbind. This is Samba 4.8.7 on >>>> FreeBSD 11.2. >>>> >>>> >>>> >>>> >>> >>> I am quite prepared to help in getting winbind working correctly, >>> but this will require the OP changing their smb.conf considerably >>> and removing sssd. We do not support sssd, it is not a Samba >>> product (for want of a better name). Samba on a Unix domain member >>> is designed around 3 binaries, smbd, nmbd and winbind, the latter >>> can do just about anything sssd can do, so why use sssd ? Now you >>> say that I am making assumptions, well about this one, probably >>> somewhere in the mix there will be Windows domain members and the >>> users in ldap are unlikely to be known to them. >> >> I consider sssd as 'just another' user database, like /etc/passwd >> (which Samba apparently does support) and I personally cannot see any >> difference there, but I respect your opinion. >> >> Where is it documented winbind should be the only service which >> should be used with nss? If it is not documented, maybe it should. > > I am not saying that sssd shouldn't be used, just Samba does not > support it. If you want to use sssd, then do so, just don't expect to > get help with using it, we don't produce it, so don't know it. > What I will say is this, there is no need to use both on a Unix domain > member, they both do the same thing. >I'm with you and don't expect that you support sssd. On the otherside windind shouln't require to be the only one in nss-setup as i didn't ever heard, that only a certain amount can be taken into the stack. 
Before nss_sss we used nss_ldap alongside with nss_winbind without issues. The only interaction i could imagine if one of the libs in the stack calls the stack from beginning waiting for a certain answer ending up in a dead end. Following the functions calls of the parse_domain_user function it seems to me samba takes care about this with the flag LOOKUP_NAME_NO_NSS in the code - only an assumption as programming is not my daily business.>> >> The proof of the pudding is of course, Alexander removing sssd from >> nsswitch.conf and show us the problem still exists, or better yet, >> disappeared. > > That is what I am trying to get at, if it is a Samba problem, then it > will still be there after sssd is removed and the smb.conf is fixed. > >> >> >>> I have seen quite a few Samba setups that are like this one, bending >>> Samba to do something it isn't designed to do, you then get >>> complaints that it slow, hangs etc. Probably fixing the set up >>> would stop all these problems. >>> >>> The OP says that it is sssd that is doing the ldap lookups, yet he >>> has these in smb.conf: >>> >>> ldap connection timeout = 10 >>> ldap timeout = 30 >> >> Yep, these lines should be removed. 
>
> Glad you agree ;-)
>

These were left over from testing config changes while looking for a solution to the problem. Removing them, so that they fall back to their defaults, had no real effect:

[root@centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep "ldap.*timeout"
        ldap connection timeout = 10
        ldap timeout = 30
[root@centos7dev64 ~]# time wbinfo -i foo
failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
Could not get info for user foo

real    1m1.700s
user    0m0.049s
sys     0m0.010s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf
[root@centos7dev64 ~]# systemctl restart smb winbind sssd ; sss_cache -E ; net cache flush
[root@centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep "ldap.*timeout"
        ldap connection timeout = 2
        ldap timeout = 15
[root@centos7dev64 ~]# time wbinfo -i foo
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user foo

real    0m59.304s
user    0m0.051s
sys     0m0.013s

It is also curious that wbinfo returns different errors for the same call:
1. failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
2. failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND

The first suggests to me that winbind got stuck or did not respond in time. The second is closer to the expected response, since foo does not include a domain and "winbind use default domain" is at its default of no. What is unexpected is the time it takes to reach that finding.

>>
>>
>>> He also has these:
>>>
>>> idmap config * : rangesize = 1000000
>>> idmap config * : range = 1000000-19999999
>>> idmap config * : backend = autorid
>>>
>>> The '*' domain is meant for the Well Known SIDs and anything outside
>>> the Samba domain. I would have expected something like this:
>>>
>>> idmap config * : backend = tdb
>>> idmap config * : range = 3000-7999
>>> idmap config OPS : backend = rid
>>> idmap config OPS : range = 10000-999999
>>
>> That should also be fixed.
>>
>>

We use this because we have a multi-domain setup on the Windows side, and it is a setup suggested on wiki.samba.org:
https://wiki.samba.org/index.php/Idmap_config_autorid

I'll try to reconfigure idmap as you suggested, taking care of all the trees in our forest, and will report back whether that changes the situation.

>>> If he wants to use Samba in a supported way, then I am more than
>>> willing to help.
>>
>> Thanks. Now let's hope Alexander is willing to jump some hoops.
>
> I am not holding my breath ;-)
>
> Rowland
>
>

Thanks for responding and keep breathing ;)

Alex
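The delay reported in this message can also be measured through the NSS stack itself rather than through wbinfo. As far as I know, "wbinfo -i" talks to winbindd directly, while getpwnam() walks every source on the passwd line of nsswitch.conf. The following is only a minimal Python sketch, assuming a Linux host; the function name time_passwd_lookup is made up for this illustration and is not part of Samba or sssd.

```python
import pwd
import time


def time_passwd_lookup(name):
    """Resolve `name` via getpwnam() and return (found, elapsed_seconds)."""
    start = time.monotonic()
    try:
        pwd.getpwnam(name)
        found = True
    except KeyError:
        # Raised when no source on the passwd line knows the name.
        found = False
    return found, time.monotonic() - start


if __name__ == "__main__":
    # "foo" and "root.wheel" are the names mentioned in the thread.
    for name in ("root", "foo", "root.wheel"):
        found, seconds = time_passwd_lookup(name)
        print(f"{name!r}: found={found} after {seconds:.3f}s")
```

Running it once with winbind on the passwd line and once without should show whether the unknown-name lookups are the slow ones.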
Rowland Penny
2019-Feb-23 21:23 UTC
[Samba] winbind causing huge timeouts/delays since 4.8
On Sat, 23 Feb 2019 21:54:31 +0100 Alexander Spannagel via samba <samba at lists.samba.org> wrote:> Am 23.02.19 um 15:48 schrieb Rowland Penny via samba: > >>>>>>> If you have, as you have, 'files sss winbind' in the the > >>>>>>> passwd & group line in nsswitch.conf, means this: > >>>>>>> First /etc/passwd or /etc/group is searched and if the user or > >>>>>>> group is found, this info is returned. > >>>>>>> Next sssd will be asked, 'do you know this user or group ?' if > >>>>>>> found, the info is returned. > >>>>>>> Finally winbind will be asked, 'do you know this user or > >>>>>>> group ?' if found, the info is returned. > >>>>>>> > >>>>>>> Lets take a user called 'fred', this user is in AD. The first > >>>>>>> search will return nothing, so sssd is asked, this 'asks' AD > >>>>>>> and returns the users info. Finally, wait that's it, we have > >>>>>>> the info, there is no need to ask winbind for anything. > >>>>>> > >>>>>> That is incorrect. Alexander stated: > >>>>>> > >>>>>>> No. we use max. 3 auth providers: (1. and 2. on all unix > >>>>>>> servers) 1. unix (local passwd) > >>>>>>> for static OS/service accounts across all our env > >>>>>>> 2. sssd (with unix ldap servers as provider) > >>>>>>> unix experienced user and application related service accounts > >>>>>>> 3. samba/winbind > >>>>>>> for windows users/services needing access to a group of unix > >>>>>>> servers > >>>>>> > >>>>>> And: > >>>>>> > >>>>>>> They don't - as stated above we use sssd for query/caching > >>>>>>> entries from our ldap directory server and not Windows > >>>>>>> DomainConmtrollers > >>>>>>> - also this is possible, but makes more trouble and don't > >>>>>>> provide what samba's smb/windbind does. > >>>>>> > >>>>>> He clearly writes (in multiple emails) that sssd is configured > >>>>>> to use his unix ldap servers and not AD. 
> >>>>>>
> >>>>>> Maybe three sources of user databases is not regular, but I
> >>>>>> fail to see why this should be a problem (provided that
> >>>>>> usernames, uidNumbers and such are unique across the
> >>>>>> databases).
> >>>>>
> >>>>> And there is the problem, if 'fred' is in /etc/passwd, that user
> >>>>> will be used, but what if you meant fred in ldap or AD ?
> >>>>>
> We are aware of this possible clash and it's handled during users
> account creation.

If you only used one database, you would be 100% sure you wouldn't get any clashes.

>
> >>>>> There is absolutely no point in having 4 databases (yes there
> >>>>> are 4, Unix, sssd, winbind and the ldap lines in smb.conf), they
> >>>>> could all be combined in AD.
> No it won't work as our windows team doesn't accept schema changes
> for unix in AD.

How shall I put this, your Windows team is either not telling you the entire truth or is stupid. The RFC2307 attributes are all part of the standard Windows schema. What isn't added is IDMU, and that just makes the Unix Attributes tab work in ADUC. You can however use other tools to create Unix users etc.

>
> >>>>>
> >>>>> The main problem is that the OP wants Samba changing to cope
> >>>>> with his mess, it might be a valid change, but the reason for
> >>>>> the change is invalid.
> The intial reason we hitted this after upgrade from samba-4.7 to 4.8
> is a script that frequentyl checks the system for changes and a final
> "chown root.wheel FILE" freezes the system for approx a minute
> (simliar to "wbinfo -i foo"). The winbind and also sssd log showd
> that both were asked about a user "root.wheel" which is another
> question, why the notation which usually (under linux) indicates
> user.group and not an account with a dot in it's name - but more to
> glibc related. Removing winbind from nss fixed the freeze, but isn't
> an option. It leads to the point that asking winbind for an uknown
> user without domain took a long time before it returns
> WBC_ERR_DOMAIN_NOT_FOUND .

You see, there you have 2 methods checking for the same thing. Also, the problem you are referring to has been fixed in the latest Samba (I think it has also been backported).

>
> >>>>
> >>>> Well, I think the problem is you _assume_ users are in multiple
> >>>> databases and we just don't know that. I think there is a good
> >>>> change Alexander perfectly knows what he is doing and users are
> >>>> unique across databases.
> >>>>
> >>>> Nevertheless, at some point nss is clearly querying winbind,
> >>>> which means nss did not find the user in either /etc/passwd nor
> >>>> via sssd. In the case that winbind _is_ queried, Alexander is
> >>>> experiencing, like he wrote, 'frequently system hangs/slowness
> >>>> for a couple of seconds' and he observed that winbind is causing
> >>>> this behaviour.
> >>>>
> >>>> So maybe we should set our focus on winbind instead of the
> >>>> multiple database stuff and figure out why it behaves like this
> >>>> since the upgrade from 4.7 to 4.8. I would say we should start
> >>>> with fixing the winbind stuff in smb.conf. Right?
> >>>>
> >>>> -Remy
> >>>>
> >>>>
> >>>> P.S. I am following this thread since I also noticed occasional
> >>>> 'hangs' when the system is querying winbind. This is Samba 4.8.7
> >>>> on FreeBSD 11.2.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>> I am quite prepared to help in getting winbind working correctly,
> >>> but this will require the OP changing their smb.conf considerably
> >>> and removing sssd. We do not support sssd, it is not a Samba
> >>> product (for want of a better name). Samba on a Unix domain member
> >>> is designed around 3 binaries, smbd, nmbd and winbind, the latter
> >>> can do just about anything sssd can do, so why use sssd ?
Now you > >>> say that I am making assumptions, well about this one, probably > >>> somewhere in the mix there will be Windows domain members and the > >>> users in ldap are unlikely to be known to them. > >> > >> I consider sssd as 'just another' user database, like /etc/passwd > >> (which Samba apparently does support) and I personally cannot see > >> any difference there, but I respect your opinion. > >> > >> Where is it documented winbind should be the only service which > >> should be used with nss? If it is not documented, maybe it should. > > > > I am not saying that sssd shouldn't be used, just Samba does not > > support it. If you want to use sssd, then do so, just don't expect > > to get help with using it, we don't produce it, so don't know it. > > What I will say is this, there is no need to use both on a Unix > > domain member, they both do the same thing. > > > I'm with you and don't expect that you support sssd. > On the otherside windind shouln't require to be the only one in > nss-setup as i didn't ever heard, that only a certain amount can be > taken into the stack. Before nss_sss we used nss_ldap alongside with > nss_winbind without issues.There is no real problem with using multiple methods in nsswitch.conf, there is just no real point, the first one to produce a result wins and as sssd & winbind do the same thing, you don't need both.> The only interaction i could imagine if one of the libs in the stack > calls the stack from beginning waiting for a certain answer ending up > in a dead end. > Following the functions calls of the parse_domain_user function it > seems to me samba takes care about this with the flag > LOOKUP_NAME_NO_NSS in the code - only an assumption as programming is > not my daily business. > > > >> > >> The proof of the pudding is of course, Alexander removing sssd from > >> nsswitch.conf and show us the problem still exists, or better yet, > >> disappeared. 
> > > > That is what I am trying to get at, if it is a Samba problem, then > > it will still be there after sssd is removed and the smb.conf is > > fixed. > > > >> > >> > >>> I have seen quite a few Samba setups that are like this one, > >>> bending Samba to do something it isn't designed to do, you then > >>> get complaints that it slow, hangs etc. Probably fixing the set up > >>> would stop all these problems. > >>> > >>> The OP says that it is sssd that is doing the ldap lookups, yet he > >>> has these in smb.conf: > >>> > >>> ldap connection timeout = 10 > >>> ldap timeout = 30 > >> > >> Yep, these lines should be removed. > > > > Glad you agree ;-) > > > These were a left over from testing config changes to find a solution > to the problem and reomving them so they get to it's default didn't > has a real affect: > [root at centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep > "ldap.*timeout" > ldap connection timeout = 10 > ldap timeout = 30 > [root at centos7dev64 ~]# time wbinfo -i foo > failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE > Could not get info for user foo > > real 1m1.700s > user 0m0.049s > sys 0m0.010s > [root at centos7dev64 ~]# vi /etc/samba/smb.conf > [root at centos7dev64 ~]# systemctl restart smb winbind sssd ; sss_cache > -E ; net cache flush > [root at centos7dev64 ~]# testparm -v 2>/dev/null </dev/null|grep > "ldap.*timeout" > ldap connection timeout = 2 > ldap timeout = 15 > [root at centos7dev64 ~]# time wbinfo -i foo > failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND > Could not get info for user foo > > real 0m59.304s > user 0m0.051s > sys 0m0.013s > > Curious is also that wbinfo returns different errors for the same > call: 1. failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE > 2. failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUNDThis is I believe an artefact of the now fixed problem.> > The first somewhow tells me winbind got stuck or not responding in > time. 
> The second is more the expected response as foo does not
> provide a domain and "winbind use default domain" is set to it's
> default no - not expected the time it takes to get to this finding.
>
> >>
> >>
> >>> He also has these:
> >>>
> >>> idmap config * : rangesize = 1000000
> >>> idmap config * : range = 1000000-19999999
> >>> idmap config * : backend = autorid
> >>>
> >>> The '*' domain is meant for the Well Known SIDs and anything
> >>> outside the Samba domain. I would have expected something like
> >>> this:
> >>>
> >>> idmap config * : backend = tdb
> >>> idmap config * : range = 3000-7999
> >>> idmap config OPS : backend = rid
> >>> idmap config OPS : range = 10000-999999
> >>
> >> That should also be fixed.
> >>
> >>
> We use this as we have a multi-domain setup on windows side and this
> is a suggested setup from wiki.samba.org:
> https://wiki.samba.org/index.php/Idmap_config_autorid

Cannot argue with that fact, it is there, but it also says it is meant to be used with the 'DOMAIN' domain, not the '*' domain. Looks like I will have to make that more prominent.

>
> I'll try to somehow reconfig idmap as you suggested taking care of
> all the trees in our forest and will report back if that changes the
> situation.
>
> >>> If he wants to use Samba in a supported way, then I am more than
> >>> willing to help.
> >>
> >> Thanks. Now let's hope Alexander is willing to jump some hoops.
> >
> > I am not holding my breath ;-)
> >
> > Rowland
> >
> >
> Thanks for responding and keep breathing ;)
>
> Alex
>
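The lookup order debated in this thread ('files sss winbind', where the first source to produce a result wins) can be sketched as a small simulation. This is only an illustration in Python, assuming the default nsswitch.conf actions (return on success, continue on not found). The dictionaries and the names nss_lookup and SOURCES are hypothetical and stand in for the real databases.

```python
def nss_lookup(name, sources):
    """Ask each source in order; return (winning_source, entry, sources_asked)."""
    asked = []
    for source_name, database in sources:
        asked.append(source_name)
        entry = database.get(name)
        if entry is not None:
            # First hit wins; later sources are never consulted.
            return source_name, entry, asked
    # Nobody knew the name, so every source was consulted.
    return None, None, asked


# Hypothetical contents, ordered like 'passwd: files sss winbind'.
SOURCES = [
    ("files", {"root": 0}),
    ("sss", {"alice": 5001}),
    ("winbind", {"OPS\\bob": 1000500}),
]

if __name__ == "__main__":
    for name in ("root", "alice", "foo"):
        print(name, nss_lookup(name, SOURCES))
```

A known name stops at the first source that has it, while an unknown name such as 'foo' reaches every source, winbind included. That is why a slow negative answer from the last source delays the whole lookup.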
> On 23.02.2019 at 22:23, Rowland Penny via samba <samba at lists.samba.org> wrote:
>
>>>>> He also has these:
>>>>>
>>>>> idmap config * : rangesize = 1000000
>>>>> idmap config * : range = 1000000-19999999
>>>>> idmap config * : backend = autorid
>>>>>
>>>>> The '*' domain is meant for the Well Known SIDs and anything
>>>>> outside the Samba domain. I would have expected something like
>>>>> this:
>>>>>
>>>>> idmap config * : backend = tdb
>>>>> idmap config * : range = 3000-7999
>>>>> idmap config OPS : backend = rid
>>>>> idmap config OPS : range = 10000-999999
>>>>
>>>> That should also be fixed.
>>>>
>>>>
>> We use this as we have a multi-domain setup on windows side and this
>> is a suggested setup from wiki.samba.org:
>> https://wiki.samba.org/index.php/Idmap_config_autorid
>
> Cannot argue with that fact, it is there, but it also says it is meant
> to be used with the 'DOMAIN' domain not the '*' domain, looks like I
> will have to make that more prominent.

idmap_autorid can be used as the default domain; Alexander's idmap config is perfectly fine.

-slow
Alexander Spannagel
2019-Feb-26 10:19 UTC
[Samba] winbind causing huge timeouts/delays since 4.8
Am 23.02.19 um 22:23 schrieb Rowland Penny via samba:> On Sat, 23 Feb 2019 21:54:31 +0100 > Alexander Spannagel via samba <samba at lists.samba.org> wrote: > >> Am 23.02.19 um 15:48 schrieb Rowland Penny via samba: >>>>>>>>> If you have, as you have, 'files sss winbind' in the the >>>>>>>>> passwd & group line in nsswitch.conf, means this: >>>>>>>>> First /etc/passwd or /etc/group is searched and if the user or >>>>>>>>> group is found, this info is returned. >>>>>>>>> Next sssd will be asked, 'do you know this user or group ?' if >>>>>>>>> found, the info is returned. >>>>>>>>> Finally winbind will be asked, 'do you know this user or >>>>>>>>> group ?' if found, the info is returned. >>>>>>>>> >>>>>>>>> Lets take a user called 'fred', this user is in AD. The first >>>>>>>>> search will return nothing, so sssd is asked, this 'asks' AD >>>>>>>>> and returns the users info. Finally, wait that's it, we have >>>>>>>>> the info, there is no need to ask winbind for anything. >>>>>>>>who does what. >>>>>>>> That is incorrect. Alexander stated: >>>>>>>> >>>>>>>>> No. we use max. 3 auth providers: (1. and 2. on all unix >>>>>>>>> servers) 1. unix (local passwd) >>>>>>>>> for static OS/service accounts across all our env >>>>>>>>> 2. sssd (with unix ldap servers as provider) >>>>>>>>> unix experienced user and application related service accounts >>>>>>>>> 3. samba/winbind >>>>>>>>> for windows users/services needing access to a group of unix >>>>>>>>> servers >>>>>>>> >>>>>>>> And: >>>>>>>> >>>>>>>>> They don't - as stated above we use sssd for query/caching >>>>>>>>> entries from our ldap directory server and not Windows >>>>>>>>> DomainConmtrollers >>>>>>>>> - also this is possible, but makes more trouble and don't >>>>>>>>> provide what samba's smb/windbind does. >>>>>>>> >>>>>>>> He clearly writes (in multiple emails) that sssd is configured >>>>>>>> to use his unix ldap servers and not AD. 
>>>>>>>>
>>>>>>>> Maybe three sources of user databases is not regular, but I
>>>>>>>> fail to see why this should be a problem (provided that
>>>>>>>> usernames, uidNumbers and such are unique across the
>>>>>>>> databases).
>>>>>>>
>>>>>>> And there is the problem, if 'fred' is in /etc/passwd, that user
>>>>>>> will be used, but what if you meant fred in ldap or AD ?
>>>>>>>
>> We are aware of this possible clash and it's handled during users
>> account creation.
>
> If you only used one database, you would be 100% sure you wouldn't get
> any clashes.
>

Agreed, but that is not feasible in our company: for good reasons there are two big server farms spread across multiple countries, one based on Linux and the other on Windows. The glue between the two farms is built on Samba because of its flexibility, strength and reliability.

>>
>>>>>>> There is absolutely no point in having 4 databases (yes there
>>>>>>> are 4, Unix, sssd, winbind and the ldap lines in smb.conf), they
>>>>>>> could all be combined in AD.
>> No it won't work as our windows team doesn't accept schema changes
>> for unix in AD.
>
> How shall I put this, your windows team is either not telling you the
> entire truth or is stupid. The entire RFC2307 attributes are part of
> the standard Windows schema. what isn't added is IDMU and this just
> makes the Unix attributes tab work in ADUC. You can however use other
> tools to create Unix users etc.
>

Besides what I mentioned earlier, you are right that the RFC2307 attributes are still around. But a TechNet blog entry from 2016 clearly states that one should look at alternatives, as it may go away in a future release. Here is the comment:

"I am using Windows Server IDMU/NIS Server role today, what should I do? We recommend to start planning for alternatives, for example: native LDAP, Samba Client, Kerberos or other non-Microsoft options.
Existing Windows Server 2012 R2 or earlier deployments will continue to be supported in accordance with the Microsoft Support lifecycle." Taken out of this blog entry: https://blogs.technet.microsoft.com/activedirectoryua/2016/02/09/identity-management-for-unix-idmu-is-deprecated-in-windows-server/>> >>>>>>> >>>>>>> The main problem is that the OP wants Samba changing to cope >>>>>>> with his mess, it might be a valid change, but the reason for >>>>>>> the change is invalid. >> The intial reason we hitted this after upgrade from samba-4.7 to 4.8 >> is a script that frequentyl checks the system for changes and a final >> "chown root.wheel FILE" freezes the system for approx a minute >> (simliar to "wbinfo -i foo"). The winbind and also sssd log showd >> that both were asked about a user "root.wheel" which is another >> question, why the notation which usually (under linux) indicates >> user.group and not an account with a dot in it's name - but more to >> glibc related. Removing winbind from nss fixed the freeze, but isn't >> an option. It leads to the point that asking winbind for an uknown >> user without domain took a long time before it returns >> WBC_ERR_DOMAIN_NOT_FOUND . > > You see, there you have 2 methods checking for the same thing, also the > problem you are referring to has been fixed in the latest Samba (I > think it has also been backported). > >> >>>>>> >>>>>> Well, I think the problem is you _assume_ users are in multiple >>>>>> databases and we just don't know that. I think there is a good >>>>>> change Alexander perfectly knows what he is doing and users are >>>>>> unique across databases. >>>>>> >>>>>> Nevertheless, at some point nss is clearly querying winbind, >>>>>> which means nss did not find the user in either /etc/passwd nor >>>>>> via sssd. 
>>>>>> In the case that winbind _is_ queried, Alexander is
>>>>>> experiencing, like he wrote, 'frequently system hangs/slowness
>>>>>> for a couple of seconds' and he observed that winbind is causing
>>>>>> this behaviour.
>>>>>>
>>>>>> So maybe we should set our focus on winbind instead of the
>>>>>> multiple database stuff and figure out why it behaves like this
>>>>>> since the upgrade from 4.7 to 4.8. I would say we should start
>>>>>> with fixing the winbind stuff in smb.conf. Right?
>>>>>>
>>>>>> -Remy
>>>>>>
>>>>>>
>>>>>> P.S. I am following this thread since I also noticed occasional
>>>>>> 'hangs' when the system is querying winbind. This is Samba 4.8.7
>>>>>> on FreeBSD 11.2.

The function that I suggested patches for hasn't changed since the patch for Bug 13503 ("getpwnam resolves local system accounts to AD accounts") was introduced with 4.8.4, so it may be the same problem. As mentioned earlier, removing the patch for Bug 13503 also resolves the problems seen.

>>>>>
>>>>> I am quite prepared to help in getting winbind working correctly,
>>>>> but this will require the OP changing their smb.conf considerably
>>>>> and removing sssd. We do not support sssd, it is not a Samba
>>>>> product (for want of a better name). Samba on a Unix domain member
>>>>> is designed around 3 binaries, smbd, nmbd and winbind, the latter
>>>>> can do just about anything sssd can do, so why use sssd ? Now you
>>>>> say that I am making assumptions, well about this one, probably
>>>>> somewhere in the mix there will be Windows domain members and the
>>>>> users in ldap are unlikely to be known to them.
>>>>
>>>> I consider sssd as 'just another' user database, like /etc/passwd
>>>> (which Samba apparently does support) and I personally cannot see
>>>> any difference there, but I respect your opinion.
>>>>
>>>> Where is it documented winbind should be the only service which
>>>> should be used with nss? If it is not documented, maybe it should.
>>>
>>> I am not saying that sssd shouldn't be used, just Samba does not
>>> support it. If you want to use sssd, then do so, just don't expect
>>> to get help with using it, we don't produce it, so don't know it.
>>> What I will say is this, there is no need to use both on a Unix
>>> domain member, they both do the same thing.
>>>
>> I'm with you and don't expect that you support sssd.
>> On the otherside windind shouln't require to be the only one in
>> nss-setup as i didn't ever heard, that only a certain amount can be
>> taken into the stack. Before nss_sss we used nss_ldap alongside with
>> nss_winbind without issues.
>
> There is no real problem with using multiple methods in nsswitch.conf,
> there is just no real point, the first one to produce a result wins
> and as sssd & winbind do the same thing, you don't need both.
>

They don't, and shouldn't, provide the same thing in our setup. The main jobs in the Unix farm can and should keep running while the Windows farm is in trouble, and vice versa. The huge delays are seen when a user isn't known to sssd and winbind tries to look that user up without an explicitly given domain, with the option "winbind use default domain" at its default of "No" in smb.conf.

>>>>> He also has these:
>>>>>
>>>>> idmap config * : rangesize = 1000000
>>>>> idmap config * : range = 1000000-19999999
>>>>> idmap config * : backend = autorid
>>>>>
>>>>> The '*' domain is meant for the Well Known SIDs and anything
>>>>> outside the Samba domain. I would have expected something like
>>>>> this:
>>>>>
>>>>> idmap config * : backend = tdb
>>>>> idmap config * : range = 3000-7999
>>>>> idmap config OPS : backend = rid
>>>>> idmap config OPS : range = 10000-999999
>>>>
>>>> That should also be fixed.
>>>>
>>>>
>> We use this as we have a multi-domain setup on windows side and this
>> is a suggested setup from wiki.samba.org:
>> https://wiki.samba.org/index.php/Idmap_config_autorid
>
> Cannot argue with that fact, it is there, but it also says it is meant
> to be used with the 'DOMAIN' domain not the '*' domain, looks like I
> will have to make that more prominent.
>
>>
>> I'll try to somehow reconfig idmap as you suggested taking care of
>> all the trees in our forest and will report back if that changes the
>> situation.
>>

I successfully reconfigured idmap config like this:

idmap config * : range = 3000-7999
idmap config * : backend = tdb
idmap config ops : range = 1000000-1999999
idmap config ops : backend = rid
idmap config ops2 : range = 2000000-2999999
idmap config ops2 : backend = rid
...

But it didn't change the problem of the delayed response or timeout from winbind when calling "wbinfo -i foo".

One good point about the suggestion: it points to a fix for a "problem" seen with autorid. The ID ranges taken from the pool for the different domains seem to vary on servers that are members of one of the other domains. Changing from autorid to rid looks like the way to fix that, as we can configure a dedicated range per domain, so uid/gid resolution will become consistent across all of our domains. :)

I did some more tests with nsswitch.conf:

1. Removing sss -> no more hangs (independent of whether shadow is configured with winbind or not)
passwd: files winbind
shadow: files # winbind
group: files winbind

2. Removing winbind -> no more hangs
passwd: files sss
shadow: files sss
group: files sss

3. Swapping winbind and sss -> no change, still delays
passwd: files winbind sss
shadow: files sss
group: files winbind sss

It still looks as if sssd or winbindd (or both) relays to or calls the other when no domain is given. I will try to correlate the debug logs to see whether one service calls the other and, if so, which.

Alex
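Why dedicated rid ranges give the same uid/gid on every server can be shown with a little arithmetic. The sketch below is Python and rests on an assumption: the formula ID = RID - BASE_RID + LOW_RANGE_ID is how I recall the idmap_rid manual page describing the mapping, with base_rid defaulting to 0, so please check it against your Samba version. The helper names and the example RID 1105 are made up for illustration; the ranges are the ones from the config in this message.

```python
from itertools import combinations

# Ranges taken from the idmap config quoted in this message.
RANGES = {
    "*": (3000, 7999),
    "ops": (1000000, 1999999),
    "ops2": (2000000, 2999999),
}


def rid_to_unix_id(rid, low, high, base_rid=0):
    """Map a RID into [low, high] using ID = RID - BASE_RID + LOW (assumed formula)."""
    unix_id = rid - base_rid + low
    if not low <= unix_id <= high:
        raise ValueError(f"RID {rid} maps to {unix_id}, outside {low}-{high}")
    return unix_id


def find_overlaps(ranges):
    """Return the pairs of domains whose ID ranges share at least one ID."""
    overlaps = []
    for (name_a, (low_a, high_a)), (name_b, (low_b, high_b)) in combinations(ranges.items(), 2):
        if low_a <= high_b and low_b <= high_a:
            overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    low, high = RANGES["ops"]
    print("ops RID 1105 ->", rid_to_unix_id(1105, low, high))
    print("overlapping ranges:", find_overlaps(RANGES))
```

Because the result depends only on the RID and the configured range, every member server with the same smb.conf computes the same ID, which matches the consistency described above. With autorid, the range a domain gets depends on local allocation order.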