On 7 November 2015 at 10:11, Rowland Penny <rowlandpenny241155 at gmail.com> wrote:
>
> Is it possible that sssd is failing?
> What do you have in /etc/nsswitch?

# cat /etc/nsswitch.conf | egrep "(passwd|group)"
passwd: files sss
group: files sss

But I don't think this is anything to do with sssd. As I understand it:

Local machine UNIX use (i.e. logging in via ssh; looking at files on
disk via "ls"; etc.) uses sssd, because this is what I have set in
nsswitch.conf. This all works fine, I have no problems with this.

"SMB file access" (i.e. a Windows client machine elsewhere on the
network, accessing resources via \\server\share\path) does not use
sssd, but uses smbd + winbind/winbindd for UID resolution? This is the
part that is failing intermittently.

> It could be that sssd isn't running or running correctly, so it cannot get
> the required info from AD, so winbind is returning the info from idmap.ldb,
> hence the '3000000' numbers.

Does winbind/wbinfo ever query what is defined in /etc/nsswitch.conf,
or does it always use the samba internal UID resolution? I thought it
would bypass nsswitch.conf entirely - hence my suspicion that this is
nothing to do with sssd.

It's hard to reproduce this at will - right now "wbinfo -i myuser" is
returning correct UID information. The problem (as far as I can tell)
is that, every so often, despite me having "idmap_ldb:use rfc2307 = yes"
in smb.conf, this same wbinfo command returns incorrect UID
information (as also shown in "net cache list"), and this is
why I cannot access files via smbd until I clear the idmap cache via
"net cache flush".

I'm trying to narrow it down to a particular set of circumstances but
it's so intermittent, I'm really struggling.

I would raise a bug on bugzilla but I'm not sure there's enough
information here for someone familiar with the code to resolve it,
yet.

It is of course possible that I'm doing something wrong - but the
thing that makes me convinced it's a bug is that I have /not/ changed
my configuration in any way since June (when I last saw this issue).
After my recent upgrade to 4.3 the problem came back - I saw it again
last night - but it has not reoccurred since then until now. I really do
think there is a subtle bug here.

Is it worth me putting all this into a bugzilla entry, even though I
haven't yet narrowed down the full circumstances under which it
happens?

Thanks

Jonathan

-- 
"If we knew what it was we were doing, it would not be called
research, would it?"
      - Albert Einstein
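
Because the failure is intermittent, one way to narrow it down might be to poll "wbinfo -i" and record the moment the UID flips into the locally allocated range. The following is only a minimal sketch: it assumes that "wbinfo -i" prints a passwd-style line and that UIDs of 3000000 and above (the numbers mentioned in this thread) come from idmap.ldb rather than from the RFC2307 attributes. The function names and the log path are hypothetical and not part of Samba.

```shell
#!/bin/sh
# Hypothetical helpers for watching an intermittent idmap problem.

# Extract the UID (third colon-separated field) from a passwd-style
# line, which is the format "wbinfo -i" prints.
uid_from_passwd_line() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Succeed if the UID looks like one allocated from idmap.ldb instead of
# one taken from the uidNumber attribute. The 3000000 threshold is an
# assumption based on the numbers reported in this thread.
is_allocated_uid() {
    [ "$1" -ge 3000000 ] 2>/dev/null
}

# Check one user and append a timestamped line to the log when the
# mapping looks wrong. Intended to be called from cron or a sleep loop.
check_user() {
    user=$1
    logfile=${2:-/var/tmp/idmap-watch.log}
    line=$(wbinfo -i "$user") || return 1
    uid=$(uid_from_passwd_line "$line")
    if is_allocated_uid "$uid"; then
        printf '%s %s mapped to allocated uid %s\n' \
            "$(date '+%F %T')" "$user" "$uid" >> "$logfile"
    fi
}
```

Calling "check_user myuser" every minute or so would give a timestamp to correlate against the Samba logs when the problem next appears.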

On 07/11/15 11:31, Jonathan Hunter wrote:
> On 7 November 2015 at 10:11, Rowland Penny <rowlandpenny241155 at gmail.com> wrote:
>> Is it possible that sssd is failing?
>> What do you have in /etc/nsswitch?
> # cat /etc/nsswitch.conf | egrep "(passwd|group)"
> passwd: files sss
> group: files sss
>
> But I don't think this is anything to do with sssd. As I understand it:
>
> Local machine UNIX use (i.e. logging in via ssh; looking at files on
> disk via "ls"; etc.) uses sssd, because this is what I have set in
> nsswitch.conf. This all works fine, I have no problems with this.
>
> "SMB file access" (i.e. a Windows client machine elsewhere on the
> network, accessing resources via \\server\share\path) does not use
> sssd, but uses smbd + winbind/winbindd for UID resolution? This is the
> part that is failing intermittently.
>
>> It could be that sssd isn't running or running correctly, so it cannot get
>> the required info from AD, so winbind is returning the info from idmap.ldb,
>> hence the '3000000' numbers.
> Does winbind/wbinfo ever query what is defined in /etc/nsswitch.conf,
> or does it always use the samba internal UID resolution? I thought it
> would bypass nsswitch.conf entirely - hence my suspicion that this is
> nothing to do with sssd.
>
> It's hard to reproduce this at will - right now "wbinfo -i myuser" is
> returning correct UID information. The problem (as far as I can tell)
> is that, every so often, despite me having "idmap_ldb:use rfc2307 = yes"
> in smb.conf, this same wbinfo command returns incorrect UID
> information (as also shown in "net cache list"), and this is
> why I cannot access files via smbd until I clear the idmap cache via
> "net cache flush".
>
> I'm trying to narrow it down to a particular set of circumstances but
> it's so intermittent, I'm really struggling.
>
> I would raise a bug on bugzilla but I'm not sure there's enough
> information here for someone familiar with the code to resolve it,
> yet.
>
> It is of course possible that I'm doing something wrong - but the
> thing that makes me convinced it's a bug is that I have /not/ changed
> my configuration in any way since June (when I last saw this issue).
> After my recent upgrade to 4.3 the problem came back - I saw it again
> last night - but it has not reoccurred since then until now. I really do
> think there is a subtle bug here.
>
> Is it worth me putting all this into a bugzilla entry, even though I
> haven't yet narrowed down the full circumstances under which it
> happens?
>
> Thanks
>
> Jonathan
>

The problem is that sssd now uses its own version of winbind. This came in (I
believe) with version 1.12.0, but I 'think' Red Hat backport some things to
earlier versions. As I understand it, you will probably be using
1.11.6-30, and it is the '30' that says what it contains; perhaps you are
using winbindd and don't realise it. Try reading the changelog for your
version of sssd and/or ask sssd.

If it isn't an sssd problem, then you will need to raise samba logging to 10,
wait until it happens again and see if you can see anything in the logs. At
this point, you can then log a bug report with something to back it up.

Rowland

On 7 November 2015 at 12:37, Rowland Penny <rowlandpenny241155 at gmail.com> wrote:
> On 07/11/15 11:31, Jonathan Hunter wrote:
>>
>> I'm trying to narrow it down to a particular set of circumstances but
>> it's so intermittent, I'm really struggling.
>
> The problem is that sssd now uses its own version of winbind. This came in (I
> believe) with version 1.12.0, but I 'think' Red Hat backport some things to
> earlier versions. As I understand it, you will probably be using
> 1.11.6-30, and it is the '30' that says what it contains; perhaps you are
> using winbindd and don't realise it. Try reading the changelog for your
> version of sssd and/or ask sssd.

I'm actually on 1.12.4-47.el6 - but I'm pretty sure that this problem
has nothing to do with sssd, as Samba won't be getting any information
from sssd, will it? The sssd part works just fine; it's the samba
internal piece (smbd/winbind?) that seems to be failing.

Only connections from remote Windows clients via samba are affected.
There are no problems with local UNIX authentication via sssd.

> If it isn't an sssd problem, then you will need to raise samba logging to 10,
> wait until it happens again and see if you can see anything in the logs. At
> this point, you can then log a bug report with something to back it up.

I will do that - thank you - but I imagine my logs will grow very
quickly, as I may well need days or weeks of logs before something
happens. Is there a subset of logging that I can turn on just for the
UID mapping code, do you know? Otherwise I'll just set debug level 10
and perhaps move my logs onto a separate disk, with more free space...

Cheers,

Jonathan

-- 
"If we knew what it was we were doing, it would not be called
research, would it?"
      - Albert Einstein
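
On the question of logging only the UID mapping code: the smb.conf "log level" parameter accepts per-class levels as well as a global one, so a setting along the following lines should keep general logging low while raising only the "idmap" and "winbind" debug classes. This is a sketch rather than a tested recommendation; the particular levels and the size cap are assumptions to adjust.

```ini
[global]
    # Keep general logging quiet, but log the ID-mapping and winbind
    # debug classes at full detail.
    log level = 1 idmap:10 winbind:10

    # Cap each log file (value in KiB) so that days of logging do not
    # fill the disk. The figure here is an arbitrary example.
    max log size = 50000
```

When a log file reaches "max log size", Samba renames it with a ".old" extension and starts a new one, so for a wait of days or weeks it may still be worth copying the logs elsewhere periodically.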

On 2015-11-07 at 12:37 +0000, Rowland Penny wrote:
> On 07/11/15 11:31, Jonathan Hunter wrote:
> >On 7 November 2015 at 10:11, Rowland Penny <rowlandpenny241155 at gmail.com> wrote:
> >>Is it possible that sssd is failing?
> >>What do you have in /etc/nsswitch?
> ># cat /etc/nsswitch.conf | egrep "(passwd|group)"
> >passwd: files sss
> >group: files sss
> >
> >But I don't think this is anything to do with sssd. As I understand it:
> >
> >Local machine UNIX use (i.e. logging in via ssh; looking at files on
> >disk via "ls"; etc.) uses sssd, because this is what I have set in
> >nsswitch.conf. This all works fine, I have no problems with this.
> >
> >"SMB file access" (i.e. a Windows client machine elsewhere on the
> >network, accessing resources via \\server\share\path) does not use
> >sssd, but uses smbd + winbind/winbindd for UID resolution? This is the
> >part that is failing intermittently.
> >
> >>It could be that sssd isn't running or running correctly, so it cannot get
> >>the required info from AD, so winbind is returning the info from idmap.ldb,
> >>hence the '3000000' numbers.
> >Does winbind/wbinfo ever query what is defined in /etc/nsswitch.conf,
> >or does it always use the samba internal UID resolution? I thought it
> >would bypass nsswitch.conf entirely - hence my suspicion that this is
> >nothing to do with sssd.
> >
> >It's hard to reproduce this at will - right now "wbinfo -i myuser" is
> >returning correct UID information. The problem (as far as I can tell)
> >is that, every so often, despite me having "idmap_ldb:use rfc2307 = yes"
> >in smb.conf, this same wbinfo command returns incorrect UID
> >information (as also shown in "net cache list"), and this is
> >why I cannot access files via smbd until I clear the idmap cache via
> >"net cache flush".
> >
> >I'm trying to narrow it down to a particular set of circumstances but
> >it's so intermittent, I'm really struggling.
> >
> >I would raise a bug on bugzilla but I'm not sure there's enough
> >information here for someone familiar with the code to resolve it,
> >yet.
> >
> >It is of course possible that I'm doing something wrong - but the
> >thing that makes me convinced it's a bug is that I have /not/ changed
> >my configuration in any way since June (when I last saw this issue).
> >After my recent upgrade to 4.3 the problem came back - I saw it again
> >last night - but it has not reoccurred since then until now. I really do
> >think there is a subtle bug here.
> >
> >Is it worth me putting all this into a bugzilla entry, even though I
> >haven't yet narrowed down the full circumstances under which it
> >happens?
> >
> >Thanks
> >
> >Jonathan
> >
>
> The problem is that sssd now uses its own version of winbind. This came in (I
> believe) with version 1.12.0, but I 'think' Red Hat backport some things to
> earlier versions. As I understand it, you will probably be using
> 1.11.6-30, and it is the '30' that says what it contains; perhaps you are
> using winbindd and don't realise it.

As of version 4.2, the Samba AD DC uses winbindd by default.
It is started by the samba daemon. Samba is designed to work
with winbindd. sssd does not contain its own winbind, but
it implements some parts of the winbind protocol.

So my suggestion is to remove sss from /etc/nsswitch.conf
and use winbind instead. This is how it is designed to work.

Also, for all I know, the DC always has local unix user and
group IDs, and does NOT use the rfc2307 attributes for this.
(Unless this has changed recently, but I can't imagine how.)
So there is nothing wrong with samba not using the rfc ids
on the DC -- this is how it works by design. These rfc uids/gids
can be used on the member servers!

Cheers - Michael
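
For reference, the nsswitch.conf change suggested above would look roughly like this. It is a sketch that assumes the winbind NSS module (libnss_winbind) is installed on the DC and that no other entries in the file need sss; only the passwd and group lines shown earlier in the thread are changed.

```
# /etc/nsswitch.conf (excerpt): resolve users and groups through
# winbind instead of sssd.
passwd: files winbind
group:  files winbind
```

After such a change, "getent passwd" for a domain user and "wbinfo -i" for the same user should report the same UID, which gives a quick consistency check.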