Andrew Bartlett
2024-Apr-02 19:28 UTC
[Samba] How to diagnose a busy LDAP server process in the Samba AD DC
1.5 seconds is pretty long; I would look into what those queries are.

I would also look into repeated queries. Sometimes these things are clients stuck in a loop where they don't complete because they expect some termination condition.

Andrew Bartlett

On Tue, 2024-04-02 at 09:25 -0300, Elias Pereira via samba wrote:
> The saga continues...
>
> I've spent a whole day with log level 5 and 7 and no error. All I
> have to do is return the log to the default and the error reappears.
>
> I monitored the "LDAP Query: Duration", but I didn't notice any
> crashes in the queries.
>
> I don't know if it's a long time, but some queries took 1.5s.
>
> Is there anything else I can do?
>
> On Mon, Mar 25, 2024 at 1:30 PM Elias Pereira <empbilly at gmail.com> wrote:
> > Hello Andrew,
> >
> > What's the explanation for when the log level is set to 5, the
> > error NT_STATUS_IO_TIMEOUT doesn't appear, but when it's at the
> > default log level, it does?
> >
> > On Mon, Mar 18, 2024 at 10:33 AM Elias Pereira <empbilly at gmail.com> wrote:
> > > hi Andrew, thanks for the help!!!
> > >
> > > > It seems to me the LDAP process being busy would be the root
> > > > cause here. Working out what is going on here is a detective
> > > > task - I always start with a wireshark trace. The client making
> > > > all the noise/traffic will be the one causing the trouble.
> > >
> > > In the wireshark analysis, should I filter only by the ldap
> > > protocol or leave everything? Should I look at something specific
> > > in the client logs?
> > >
> > > On Sun, Mar 10, 2024 at 9:31 PM Andrew Bartlett <abartlet at samba.org> wrote:
> > > > Thanks for getting back to me.
> > > >
> > > > It seems to me the LDAP process being busy would be the root
> > > > cause here. Working out what is going on here is a detective
> > > > task - I always start with a wireshark trace. The client making
> > > > all the noise/traffic will be the one causing the trouble.
> > > >
> > > > If it isn't clear from that, then look into the DB audit
> > > > logging for perhaps busy writes:
> > > >
> > > > https://wiki.samba.org/index.php/Setting_up_Audit_Logging#Enabling_AD_DC_Database_Audit_Logging
> > > >
> > > > Finally, set 'log level = 5' and look for logs like:
> > > > LDAP Query: Duration was
> > > >
> > > > This will tell you about how long each query is taking,
> > > > potentially showing a particularly slow query that needs to be
> > > > stopped.
> > > >
> > > > Andrew Bartlett
> > > >
> > > > On Sun, 2024-03-10 at 19:46 -0300, Elias Pereira wrote:
> > > >
> > > > Is the local drepl process very busy doing inbound
> > > > replication?
> > > >
> > > > How can I check this?
> > > >
> > > > My instinct is either the server is very busy (and this should
> > > > show up in CPU use) or a transaction is being held open
> > > > excessively.
> > > >
> > > > I use VMs on Proxmox. In DC1, I installed the Proxmox agent,
> > > > and CPU usage via the dashboard is very low. However, when I
> > > > checked using 'top', the LDAP process is consuming around
> > > > 94/96% of the CPU. Very strange.
> > > >
> > > > It is probably 94% of a single CPU, but you might have 8 CPUs
> > > > in the VM, so overall use is low.
> > > >
> > > > The VM has 4 CPUs and 6GB of memory.
> > > >
> > > > On Sun, Mar 10, 2024 at 5:55 PM Andrew Bartlett <abartlet at samba.org> wrote:
> > > >
> > > > Either the local server is busy, or possibly (but it would not
> > > > explain the samba_kcc) Samba's drepl process is stuck talking
> > > > to a remote server.
> > > >
> > > > Is the local drepl process very busy doing inbound
> > > > replication?
> > > >
> > > > My instinct is either the server is very busy (and this should
> > > > show up in CPU use) or a transaction is being held open
> > > > excessively.
> > > >
> > > > Andrew Bartlett
> > > >
> > > > On Sat, 2024-03-09 at 19:11 -0300, Elias Pereira via samba wrote:
> > > >
> > > > I've been grappling with a recurring set of errors for quite
> > > > some time now:
> > > >
> > > > - UpdateRefs failed with NT_STATUS_IO_TIMEOUT
> > > > - Failed samba_kcc - NT_STATUS_IO_TIMEOUT
> > > > - IRPC callback failed for DsReplicaSync - NT_STATUS_IO_TIMEOUT
> > > >
> > > > Despite cranking up the log level to 10, the returned
> > > > information remains frustratingly cryptic and hard to decipher.
> > > >
> > > > This error, being overly generic, continues to elude
> > > > identification even with the heightened log verbosity. The
> > > > challenge lies in tracing its origin.
> > > >
> > > > Running samba-tool dbcheck doesn't reveal any problems, yet
> > > > executing the command while monitoring the Samba log with
> > > > "tail -f" exposes errors identical to those described above.
> > > >
> > > > Interestingly, samba-tool drs showrepl doesn't report any
> > > > errors.
> > > >
> > > > So, what additional steps can be taken to unearth the root
> > > > cause of these persistent NT_STATUS_IO_TIMEOUT errors?
> > > >
> > > > On Fri, Mar 1, 2024 at 10:32 PM Elias Pereira <empbilly at gmail.com> wrote:
> > > >
> > > > There is probably nothing wrong with your log, but Firefox
> > > > doesn't like it, it thinks it contains a virus.
> > > >
> > > > I just saw now that your response ended up in spam, probably
> > > > because of the link with the log. O.o
> > > >
> > > > I still receive the error in the logs:
> > > > source4/dsdb/kcc/kcc_periodic.c:790: Failed samba_kcc - NT_STATUS_IO_TIMEOUT
> > > >
> > > > The strangest thing is that it occurs when the command is
> > > > executed:
> > > > samba-tool dbcheck --cross-ncs --fix --yes
> > > >
> > > > Could it be some object causing this error?
> > > >
> > > > On Mon, Feb 12, 2024 at 4:40 PM Rowland Penny via samba <samba at lists.samba.org> wrote:
> > > >
> > > > On Mon, 12 Feb 2024 16:20:27 -0300
> > > > Elias Pereira via samba <samba at lists.samba.org> wrote:
> > > >
> > > > hi,
> > > >
> > > > My saga continues...
> > > >
> > > > I've configured the audit log for drs_repl in smb.conf, and
> > > > below is the log generated.
> > > > https://transfer.sh/7fen4qCNIQ/drs_repl.log
> > > >
> > > > The log level was 5.
> > > > drs_repl:5@/var/log/samba/drs_repl.log
> > > >
> > > > Could someone take a look and help me understand the log?
> > > >
> > > > There is probably nothing wrong with your log, but Firefox
> > > > doesn't like it, it thinks it contains a virus.
> > > >
> > > > Rowland

-- 
Andrew Bartlett (he/him)       https://samba.org/~abartlet/
Samba Team Member (since 2001) https://samba.org
Samba Team Lead                https://catalyst.net.nz/services/samba
Catalyst.Net Ltd

Proudly developing Samba for Catalyst.Net Ltd - a Catalyst IT group company

Samba Development and Support: https://catalyst.net.nz/services/samba

Catalyst IT - Expert Open Source Solutions
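For anyone following the advice above (look at what the slow queries are, and check for repeated ones), here is a minimal Python sketch that scans log lines for the "LDAP Query: Duration was" text mentioned in the thread. It is not part of Samba. The function name `summarize`, the 1.0 second threshold, and the assumption that the duration is written as a number followed by `s`, with the query description after it on the same line, are all illustrative guesses; check a few real lines from your own log at 'log level = 5' and adjust the pattern before trusting the output.

```python
import re
from collections import Counter

# ASSUMPTION: a query log line contains "LDAP Query:" followed by
# "Duration was <seconds>s". The real Samba line layout may differ,
# so adapt this pattern to what your own log actually shows.
DURATION_RE = re.compile(
    r"LDAP Query:.*?Duration was\s+([0-9]+(?:\.[0-9]+)?)\s*s"
)


def summarize(lines, threshold=1.0):
    """Scan an iterable of log lines.

    Returns (slow, repeats):
      slow    - list of (seconds, line) for queries at or above threshold,
                slowest first
      repeats - Counter keyed by the text that follows the duration,
                used here as a rough stand-in for "the same query"
    """
    slow = []
    repeats = Counter()
    for line in lines:
        match = DURATION_RE.search(line)
        if not match:
            continue  # not a query-duration line
        seconds = float(match.group(1))
        # Treat everything after the duration as the query description.
        query_text = line[match.end():].strip(" ,\n")
        repeats[query_text] += 1
        if seconds >= threshold:
            slow.append((seconds, line.rstrip("\n")))
    slow.sort(reverse=True)
    return slow, repeats
```

To use it, open the Samba log file, pass the file object to `summarize`, then print the first entries of `slow` and `repeats.most_common(10)`. A query description that shows up thousands of times from one client would be a candidate for the "client stuck in a loop" case described above.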
Elias Pereira
2024-Apr-11 17:21 UTC
[Samba] How to diagnose a busy LDAP server process in the Samba AD DC
Hello Andrew,

1. What is the explanation for the fact that when the log level is set to 5 or 7, the NT_STATUS_IO_TIMEOUT error does not appear, but when it is at the default log level, it does?

Another point I've noticed before is that when I run the command "samba-tool dbcheck --cross-ncs --reset-well-known-acls --fix --yes" (*Checked 15337 objects (0 errors)*) and analyze the log in another terminal, some errors always occur:

*source4/dsdb/kcc/kcc_periodic.c:790: Failed samba_kcc - NT_STATUS_IO_TIMEOUT*
*IRPC callback failed for DsReplicaSync - NT_STATUS_IO_TIMEOUT*

2. Could there be discrepancies between the objects, given that running the command "samba-tool ldapcmp..." shows no differences between the DCs?

On Tue, Apr 2, 2024 at 4:28 PM Andrew Bartlett <abartlet at samba.org> wrote:
> 1.5 seconds is pretty long, I would look into what those queries are.
>
> I would also look into repeated queries, sometimes these things are
> clients stuck in a loop where they don't complete because they expect
> some termination condition.
>
> Andrew Bartlett

-- 
Elias Pereira