Rowland Penny
2022-Feb-11 22:05 UTC
[Samba] Corruption of winbind cache after converting NT4 to AD domain
On Sat, 2022-02-12 at 00:34 +0300, Michael Tokarev via samba wrote:
> Hi!
>
> We've been using an NT4 domain with Samba for many years (more than a
> decade for sure), quite successfully. And instead of fighting with it
> every time, we finally decided to convert it to AD. And with that, we
> faced numerous quite bad issues, so that our network hasn't been
> working right for over a week already. Here's one of the issues
> (more to follow).
>
> I created a new machine for the DC, parallel to the fileserver which
> was everything at once. Copied all configuration and data to it, and
> did classicupgrade there. Which worked fine after several attempts
> (we had to fix some issues, that's ok).
>
> The main fileserver - I stopped it, moved everything out, leaving
> just the share definitions in the conffile, and joined it to the
> domain (net ads join member). Which also went fine, and after
> configuring nsswitch and other stuff, it started working.
>
> And immediately we faced a problem with roaming profiles - at first
> Windows did everything, but after a few logins/logouts it refused to
> synchronize the profile, telling that its owner is wrong - "Unix user
> mjt" instead of "DOMAIN\mjt".
>
> After long and painful debugging (since there's very little info
> about how it all works, which component does what and how it all
> should be done) it all boiled down to winbind cache
> corruption/pollution. Somewhat similar to this one:
>
> https://lists.samba.org/archive/samba-technical/2019-February/132730.html
>
> except that in our case it is different.
>
> After net cache flush I look up every uid we have with
> wbinfo --uid-info. Everything's fine, every uid is looked up fine.
> But after some random time, wbinfo --uid-info starts to return
> DOMAIN_NOT_FOUND errors for one or two, some more time and the amount
> of "not found" entries grows and grows.
>
> These are just selected parts of the picture, the whole winbind trace
> file is here:
> http://www.corpit.ru/mjt/tmp/winbind.trc
>
> Obviously, from now on, uid 1068 does not work anymore. Over time,
> more and more uids stop working, until the next `net cache flush'.
>
> Now, the most "interesting" part, besides the obvious wrong behaviour
> somewhere.
>
> For a long time, we had unix users with their own regular home
> directories, shell access and lots of work in linux. As far as I can
> see, in order to use an AD domain, we should convert linux users to
> AD, so that a user is EITHER in linux OR in AD, but not both. I found
> nothing conclusive about this,

The old way was to have a Unix user and a Samba user; this mapped
Windows users to Unix users. Now, with AD, you only have one user and
that user is stored in AD. Winbind maps the AD user to a Unix ID and
hence makes the user a Unix user. This all means that if you have a
user called 'fred' in both AD and /etc/passwd, you should remove the
local Unix user from /etc/passwd.

> it is just my gut feeling, - there's no direct requirement like this
> in the docs

This was explained in the Samba wiki, but someone has just removed it.

> I found so far. But I see that people do it like this, not mixing
> uids and usernames. It is just my gut feeling, maybe I'm wrong..

It is not so much that you are mixing uids and usernames, you seem to
be possibly mixing users.

> So there are two parts of the question:
>
> First, how should such a setup be done? We're really used to linux
> auth and linux work, it's somewhat unnatural to rely on the AD when
> dealing with local linux accounts. But at the same time, these
> accounts should have access from windows to their files.
> And most important, _why_ should this setup be done?

You should only have users in AD, and 'getent passwd username' should
produce output, something like this:

rowland@devstation:~$ getent passwd rowland
rowland:*:10000:10000:Rowland Penny:/home/rowland:/bin/bash

I can assure you that 'rowland' isn't in /etc/passwd.

> And second, what to do with this cache corruption, how to prevent it?

Set up your system correctly.

> Is it possible to perform AD auth by samba AND linux auth when
> logging in to the linux machine? Adding --no-cache to the winbind
> command line helped, but this obviously is not a good solution...

No, it is a BAD solution.

> System info:
>
> samba 4.13.13+dfsg-1~deb11u2 on debian bullseye, current.
>
> smb.conf:
> [global]
> server string = %h samba server %v
> netbios name = TSRV
> netbios aliases = LINUX FS

I do not recommend using 'netbios aliases'; use a DNS 'CNAME' instead.

> realm = TLS.MSK.RU
> workgroup = TLS
> server role = member server
> security = ADS
>
> idmap config TLS : backend = ad
> idmap config TLS : range = 1000-3000
> idmap config TLS : schema_mode = rfc2307
> idmap config TLS : unix_primary_group = yes
> template homedir = /home/%U
> idmap config * : backend = tdb
> idmap config * : range = 5000-7000
>
> ...share definitions...
>
> Thank you for the time! It turned out to be quite a bit longer than I
> expected...

No problem, I await your further questions :-)

Rowland
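The symptom described in the quoted report (uids that resolve right after `net cache flush` and later start failing) can be tracked with a small script. The following is only a sketch: the function names are made up for illustration, and it assumes that `wbinfo --uid-info=<uid>` exits with a non-zero status when the lookup fails. The lookup is passed in as a parameter so the logic can be tried without a domain.

```python
import subprocess
from typing import Callable, Iterable, List


def wbinfo_uid_lookup(uid: int) -> bool:
    """Return True if `wbinfo --uid-info=<uid>` succeeds.

    Assumption: wbinfo exits non-zero when the uid cannot be resolved.
    """
    result = subprocess.run(
        ["wbinfo", f"--uid-info={uid}"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0


def resolvable_uids(uids: Iterable[int],
                    lookup: Callable[[int], bool] = wbinfo_uid_lookup) -> List[int]:
    """Return the uids that currently resolve (hypothetical helper)."""
    return sorted(uid for uid in uids if lookup(uid))


def regressed_uids(baseline: Iterable[int],
                   lookup: Callable[[int], bool] = wbinfo_uid_lookup) -> List[int]:
    """Return uids from the baseline that no longer resolve."""
    return sorted(uid for uid in baseline if not lookup(uid))
```

One way to use it: call `resolvable_uids(range(1000, 3001))` right after `net cache flush` (1000-3000 being the idmap range in the posted smb.conf) to record a baseline, then call `regressed_uids(baseline)` periodically and log the result with a timestamp to see when entries start to go missing.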
Michael Tokarev
2022-Feb-11 22:48 UTC
[Samba] Corruption of winbind cache after converting NT4 to AD domain
12.02.2022 01:05, Rowland Penny via samba wrote:
> On Sat, 2022-02-12 at 00:34 +0300, Michael Tokarev via samba wrote:
...
>> Now, the most "interesting" part, besides the obvious wrong behaviour somewhere.
>>
>> For a long time, we had unix users with their own regular home directories,
>> shell access and lots of work in linux. As far as I can see, in order to
>> use an AD domain, we should convert linux users to AD, so that a user is EITHER
>> in linux OR in AD, but not both. I found nothing conclusive about this,
>
> The old way was to have a Unix user and a Samba user; this mapped
> Windows users to Unix users. Now, with AD, you only have one user and
> that user is stored in AD. Winbind maps the AD user to a Unix ID and
> hence makes the user a Unix user. This all means that if you have a
> user called 'fred' in both AD and /etc/passwd, you should remove the
> local Unix user from /etc/passwd.

This is a very important point, Rowland. And once I started to realize
it, I started wondering why it is not written in ALL BOLD in all HOWTOs
and wikis out there. Because when you live in the NT4-domain world, the
"AD world" is VERY different in this respect, and you just don't
understand it.

This is, in fact, the main reason why I asked for the wiki account - to
draw up a summary of some sort, stating how and especially WHY things
should be done and how/why they're different between NT4 and AD.
Because everything talks about converting your users from NT4 to AD as
if that were all that needs to be done, while it is just the very
beginning.

>> it is just my gut feeling, - there's no direct requirement like this
>> in the docs
>
> This was explained in the Samba wiki, but someone has just removed it.

It should be in ALL BOLD. Really, I'm not joking. This concept is so
foreign to anyone who's used to NT4 or unix.. I'm old school, but even
for many modern sysadmins this is something foreign too. That's why, I
think, a wiki about AD should start from its concepts and some WHYs.
I think most of the misunderstanding is due to this. Again, I have a
lot of experience in this area, I understand how it works (the whole
picture), but for those who have less knowledge it is even more
difficult, - the HOWTOs describe the steps which should be done but do
not add any understanding...

>> I found so far. But I see that people do it like this, not mixing uids and
>> usernames. It is just my gut feeling, maybe I'm wrong..
>
> It is not so much that you are mixing uids and usernames, you seem to
> be possibly mixing users.

What do you mean by "mixing users"? What I want to achieve is to have
one user with its home directory, files and processes, who can log in
to the linux environment in the natural linux way (either desktop or
ssh or whatever), and who can access his home directory from windows,
using windows ways to authenticate, with the help of samba AD.

Roughly speaking, we have local linux users with their passwords and
ssh keys, and their windows passwords are stored within AD. The uid
numbers are the same, the names are the same. And I don't understand
that this is bad, and especially WHY it is bad. Besides the bad (in my
view: buggy) behaviour of winbind (it should either give a meaningful
error message or it should work, but not error out randomly with
issues that are very difficult to debug).

It is not mixed users, - I view it as the same single user whose
windows-related attributes are stored in the AD. That's it.

You see - this is why it's so difficult to grok this concept even when
you have a strong background.

>> So there are two parts of the question:
>>
>> First, how should such a setup be done? We're really used to linux
>> auth and linux work, it's somewhat unnatural to rely on the AD when
>> dealing with local linux accounts. But at the same time, these
>> accounts should have access from windows to their files. And most
>> important, _why_ should this setup be done?
>
> You should only have users in AD, and 'getent passwd username' should
> produce output, something like this:
>
> rowland@devstation:~$ getent passwd rowland
> rowland:*:10000:10000:Rowland Penny:/home/rowland:/bin/bash
>
> I can assure you that 'rowland' isn't in /etc/passwd.

The main question is why. And something inside me is fighting this
idea too: why should we move our lovely local users to some remote
location and make our main server dependent on some other machine(s)
while it is already self-contained? We do have local accounts on all
linux servers (sharing the same uids), - this may be difficult to
administer (it isn't, once you get used to it), but it is 100%
reliable.

>> And second, what to do with this cache corruption, how to prevent it?
>
> Set up your system correctly.

Why is it "incorrect"? I just don't understand the main concept, it
seems...

>> smb.conf:
>> [global]
>> server string = %h samba server %v
>> netbios name = TSRV
>> netbios aliases = LINUX FS
>
> I do not recommend using 'netbios aliases'; use a DNS 'CNAME' instead.

Hm. It's actually interesting. I didn't plan to mention this but we
faced an issue here too. When I added a CNAME for a host, it didn't
work, - neither from windows nor from smbclient. When logging in, the
server returned "wrong password" when connecting to //cname/foo -U foo,
but it worked fine when using //mainname/foo -U foo. And it didn't work
until I added the above netbios aliases line and re-joined this server
to the domain (net ads leave | join). Only after that were clients able
to connect. It took me lots of time to figure it out. In the AD some
AltName attributes appeared after the rejoin (I don't remember
exactly).
I didn't experiment with this further, because nothing has worked on my
side for over a week and I need to fix _that_ first :)

..in another email, you wrote:

> If you have AD, there is no point in using Samba as a standalone
> server, in fact, if you later decide to join the 'standalone server' to
> the domain, that is where your troubles start.

This is exactly what we're doing actually: converting a standalone
server to a member of a domain. And for now I had to revert it back to
its original configuration with the NT4-style domain in parallel with
the AD. The two have the same set of SIDs, users, and especially user
passwords, for now, and everything works. And my 10 or 20 attempts to
join this server to the new and empty domain have failed, so we're back
with two servers in parallel :))

...
> It all just works, you may need slightly different 'incantations' in the
> conf files, but it all just works.

I know it works.. I just need to understand the basic concept :))

> One of the benefits of this is that you can use SSH with kerberos, no
> keys.

So far I have found that not using ssh keys is bad... :) I can manage
my key pretty well on my own machine without depending on anything (eg
my laptop without a network connection). And I can log in to many
different machines worldwide with my key - to machines which are not
part of our domains - without the risk of exposing ssh to brute-force
attacks. And from what I see about kerberos, its way of storing tickets
isn't as good as, eg., the way ssh-agent stores ssh keys. And things
become even more interesting when using security tokens. But this is a
different topic.

>> Thank you for the time! It turned out to be quite a bit longer than I
>> expected...
>
> No problem, I await your further questions :-)

Ehh.. Actually I'm a bit too talkative sometimes ;) But I think these
are pieces of information which are not obvious to many and are
difficult to understand... I definitely want to make some summary out
of this.

Thank you!

/mjt
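The situation discussed in this thread (the same user name present both in /etc/passwd and in AD) can be checked for mechanically. What follows is a minimal sketch, not a Samba tool: the function names are hypothetical, the local accounts are assumed to come from the text of /etc/passwd read directly (not from getent, which merges all NSS sources), and the AD names are assumed to come from something like the output of `wbinfo -u`, with or without a `DOMAIN\` prefix.

```python
from typing import Dict, Iterable, List


def parse_passwd(text: str) -> Dict[str, int]:
    """Map user name -> uid for every well-formed line of passwd-format text."""
    users: Dict[str, int] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # A passwd entry has 7 colon-separated fields; the third is the uid.
        if len(fields) < 7 or not fields[2].isdigit():
            continue
        users[fields[0]] = int(fields[2])
    return users


def bare_name(domain_user: str) -> str:
    """Drop a leading domain prefix (text up to the last backslash) and lower-case."""
    return domain_user.rsplit("\\", 1)[-1].strip().lower()


def duplicated_users(passwd_text: str, ad_users: Iterable[str]) -> List[str]:
    """Return local user names that also exist among the AD user names."""
    local = parse_passwd(passwd_text)
    ad_names = {bare_name(user) for user in ad_users if user.strip()}
    return sorted(name for name in local if name.lower() in ad_names)
```

Any name this returns is a candidate for the "mixing users" case described above. Whether to remove the local entry or rename it is a decision for the administrator; the sketch only reports the overlap.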