I've searched the manuals and the wiki, but I can't find any specifics about NEG_CONN_CACHE entries other than the 'idmap negative cache time' option in smb.conf, which refers to SID/UID/GID queries and not to unavailable DCs.

Specifically, the issue I've run into recently is this: with 'winbind max domain connections' set to 10, Winbind had a single active connection to the DC on port 49159 (the RPC pipe for LSA/SAM/NetLogon, from what I understand). It was trying to establish a second connection to serve an incoming auth request, and was failing to do so during DC location. At some point it had tried a NetLogon ping to both DCs (the domain has only two), and the ping timed out for both. It then put both DCs into the negative cache, so the DC location process was left with no candidate DCs once the negative cache entries were eliminated from the list.

So my questions are:

1. What's the default NEG_CONN_CACHE TTL?
2. Is there a way to control NEG_CONN_CACHE, either its TTL or its contents? Is there a way to force Winbind to try connecting to the DCs it didn't have success with before? 'net cache flush' didn't seem to have much effect.
3. Are the rules for how a DC gets put into NEG_CONN_CACHE documented anywhere besides the code itself, or is wading through the code my only option for learning the criteria?
On Wed, May 27, 2020 at 12:54:49PM -0700, Alexey A Nikitin via samba wrote:
> I've tried searching manuals and wiki, but I can't seem to find any specifics about NEG_CONN_CACHE entries other than 'idmap negative cache time' option in smb.conf, which refers to SID/UID/GID queries and not unavailable DCs.
>
> Specifically the issue I've run into recently is that with 'winbind max domain connections' set to 10 I saw Winbind had single active connection to the DC on port 49159 (RPC pipe from LSA/SAM/NetLogon, from what I understand), but it was trying to establish a second connection to serve incoming auth request and it was failing to do so during DC location, because at some point it tried to do NetLogon ping to both DCs (domain has only two) and for both the ping timed out, and then it put both DCs into negative cache and during DC location process was left with no candidate DCs after negative cache entry elimination from the list. So my questions are:
>
> 1. What's the default NEG_CONN_CACHE TTL?

60 seconds.

#define FAILED_CONNECTION_CACHE_TIMEOUT (LONG_CONNECT_TIMEOUT * 2 / 1000)
#define LONG_CONNECT_TIMEOUT 30000

> 2. Is there a way to control NEG_CONN_CACHE, either TTL or contents? Is there a way to force Winbind to try connecting to the DCs it didn't have success with before? 'net cache flush' didn't seem to have much effect.
> 3. Are the rules for how a DC gets put into NEG_CONN_CACHE documented anywhere besides the code itself, or wading through the code is my only option of getting to know the criteria?

Only in the code, I think. Entries are added in:

add_failed_connection_entry()

They can be cleared by:

flush_negative_conn_cache_for_domain(), which is triggered by winbindd getting a request to go online.
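To make the arithmetic and the failure mode concrete, here is a minimal sketch. It is not Samba's implementation: only the two macros come from the reply above, while `struct neg_entry`, `neg_entry_is_live()` and `usable_dc_count()` are hypothetical names invented for illustration, and the exact expiry comparison is an assumption. It shows that the macro evaluates to 60 seconds (30000 ms * 2 / 1000), and that when every candidate DC has a live negative-cache entry, no candidates are left.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* The two macros quoted in the reply above. */
#define LONG_CONNECT_TIMEOUT 30000 /* milliseconds */
#define FAILED_CONNECTION_CACHE_TIMEOUT (LONG_CONNECT_TIMEOUT * 2 / 1000) /* seconds */

/* 30000 ms * 2 / 1000 = 60 s, checked at compile time. */
_Static_assert(FAILED_CONNECTION_CACHE_TIMEOUT == 60, "TTL is 60 seconds");

/* Hypothetical negative-cache entry (not Samba's data structure). */
struct neg_entry {
    const char *dc_name; /* DC that failed */
    time_t failed_at;    /* when the failure was recorded */
};

/* An entry blocks its DC until the TTL has elapsed (assumed semantics). */
bool neg_entry_is_live(const struct neg_entry *e, time_t now)
{
    return (now - e->failed_at) < FAILED_CONNECTION_CACHE_TIMEOUT;
}

/* Count the candidate DCs that survive negative-cache elimination. */
size_t usable_dc_count(const char *const *candidates, size_t n_candidates,
                       const struct neg_entry *cache, size_t n_cache,
                       time_t now)
{
    size_t usable = 0;

    assert(candidates != NULL || n_candidates == 0);

    for (size_t i = 0; i < n_candidates; i++) {
        bool blocked = false;
        for (size_t j = 0; j < n_cache; j++) {
            if (strcmp(candidates[i], cache[j].dc_name) == 0 &&
                neg_entry_is_live(&cache[j], now)) {
                blocked = true;
                break;
            }
        }
        if (!blocked) {
            usable++;
        }
    }
    return usable;
}
```

With two DCs that both failed at the same moment, `usable_dc_count()` returns 0 for any `now` less than 60 seconds after the failure, which matches the "no candidate DCs" situation described in the question. In this sketch both become eligible again once the TTL has elapsed, provided nothing re-adds them in the meantime.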
On Wednesday, 27 May 2020 16:21:31 PDT Jeremy Allison wrote:
> On Wed, May 27, 2020 at 12:54:49PM -0700, Alexey A Nikitin via samba wrote:
> > 3. Are the rules for how a DC gets put into NEG_CONN_CACHE documented anywhere besides the code itself, or wading through the code is my only option of getting to know the criteria?
>
> Only in the code I think, added in:
>
> add_failed_connection_entry()
>
> Can be cleared by:
>
> flush_negative_conn_cache_for_domain(), which is triggered
> by winbindd getting a request to go online.
>

But if winbind is configured with 'winbind offline logon = No' then, from what I understand, winbindd will never get that request, except maybe on restart, no?

A related question: it seems that when 'winbind max domain connections' is set to a value above 1, Winbind attempts to open a new connection for incoming authentication requests, judging from the fact that it keeps trying to do DC location (and fails, because both candidate DCs are stuck in NEG_CONN_CACHE for some reason, even though they're answering requests from, e.g., adcli). There is already an RPC pipe (an ESTAB connection to port 49159 on the DC), but Winbind seems to insist on opening a new connection and doesn't reuse the existing one. Am I misinterpreting something? I thought Winbind was supposed to open a new connection only when the existing one is busy with some request?