Elias Pereira
2024-Mar-25 16:30 UTC
[Samba] How to diagnose a busy LDAP server process in the Samba AD DC
Hello Andrew,

Why is it that with the log level set to 5 the NT_STATUS_IO_TIMEOUT error doesn't appear, but at the default log level it does?

On Mon, Mar 18, 2024 at 10:33 AM Elias Pereira <empbilly at gmail.com> wrote:

> hi Andrew, thanks for the help!!!
>
>> It seems to me the LDAP process being busy would be the root cause here.
>> Working out what is going on here is a detective task - I always
>> start with a wireshark trace. The client making all the noise/traffic will
>> be the one causing the trouble.
>
> In the wireshark analysis, should I filter only by the ldap protocol or
> leave everything? Should I look at something specific in the client logs?
>
> On Sun, Mar 10, 2024 at 9:31 PM Andrew Bartlett <abartlet at samba.org> wrote:
>
>> Thanks for getting back to me.
>>
>> It seems to me the LDAP process being busy would be the root cause here.
>> Working out what is going on here is a detective task - I always
>> start with a wireshark trace. The client making all the noise/traffic will
>> be the one causing the trouble.
>>
>> If it isn't clear from that, then look into the DB audit logging for
>> perhaps busy writes:
>>
>> https://wiki.samba.org/index.php/Setting_up_Audit_Logging#Enabling_AD_DC_Database_Audit_Logging
>>
>> Finally, set 'log level = 5' and look for logs like:
>> LDAP Query: Duration was
>>
>> This will tell you how long each query is taking, potentially
>> showing a particularly slow query that needs to be stopped.
>>
>> Andrew Bartlett
>>
>> On Sun, 2024-03-10 at 19:46 -0300, Elias Pereira wrote:
>>
>>>> Is the local drepl process very busy doing inbound replication?
>>>
>>> How can I check this?
>>>
>>>> My instinct is either the server is very busy (and this should show up in
>>>> CPU use) or a transaction is being held open excessively.
>>>
>>> I use VMs on Proxmox. In DC1, I installed the Proxmox agent, and CPU
>>> usage via the dashboard is very low. However, when I checked using 'top',
>>> the LDAP process is consuming around 94-96% of the CPU. Very strange.
>>
>> It is probably 94% of a single CPU, but you might have 8 CPUs in the VM,
>> so overall use is low.
>>
>>> The VM has 4 CPUs and 6GB of memory.
>>>
>>> On Sun, Mar 10, 2024 at 5:55 PM Andrew Bartlett <abartlet at samba.org> wrote:
>>>
>>>> Either the local server is busy, or possibly (but it would not explain
>>>> the samba_kcc) Samba's drepl process is stuck talking to a remote server.
>>>>
>>>> Is the local drepl process very busy doing inbound replication?
>>>>
>>>> My instinct is either the server is very busy (and this should show up in
>>>> CPU use) or a transaction is being held open excessively.
>>>>
>>>> Andrew Bartlett
>>>>
>>>> On Sat, 2024-03-09 at 19:11 -0300, Elias Pereira via samba wrote:
>>>>
>>>>> I've been grappling with a recurring set of errors for quite some time now:
>>>>>
>>>>> - UpdateRefs failed with NT_STATUS_IO_TIMEOUT
>>>>> - Failed samba_kcc - NT_STATUS_IO_TIMEOUT
>>>>> - IRPC callback failed for DsReplicaSync - NT_STATUS_IO_TIMEOUT
>>>>>
>>>>> Despite cranking up the log level to 10, the returned information remains
>>>>> frustratingly cryptic and hard to decipher.
>>>>>
>>>>> This error, being overly generic, continues to elude identification even
>>>>> with the heightened log verbosity. The challenge lies in tracing its origin.
>>>>>
>>>>> Running samba-tool dbcheck doesn't reveal any problems, yet executing the
>>>>> command while monitoring the Samba log with "tail -f" exposes errors
>>>>> identical to those described above.
>>>>>
>>>>> Interestingly, samba-tool drs showrepl doesn't report any errors.
>>>>>
>>>>> So, what additional steps can be taken to unearth the root cause
>>>>> of these persistent NT_STATUS_IO_TIMEOUT errors?
>>>>>
>>>>> On Fri, Mar 1, 2024 at 10:32 PM Elias Pereira <empbilly at gmail.com> wrote:
>>>>>
>>>>>>> There is probably nothing wrong with your log, but Firefox doesn't
>>>>>>> like it, it thinks it contains a virus.
>>>>>>
>>>>>> I just saw now that your response ended up in spam, probably because of
>>>>>> the link with the log. O.o
>>>>>>
>>>>>> I still receive the error in the logs:
>>>>>> source4/dsdb/kcc/kcc_periodic.c:790: Failed samba_kcc - NT_STATUS_IO_TIMEOUT
>>>>>>
>>>>>> The strangest thing is that it occurs when the command is executed:
>>>>>> samba-tool dbcheck --cross-ncs --fix --yes
>>>>>>
>>>>>> Could it be some object causing this error?
>>>>>>
>>>>>> On Mon, Feb 12, 2024 at 4:40 PM Rowland Penny via samba <samba at lists.samba.org> wrote:
>>>>>>
>>>>>>> On Mon, 12 Feb 2024 16:20:27 -0300
>>>>>>> Elias Pereira via samba <samba at lists.samba.org> wrote:
>>>>>>>
>>>>>>>> hi,
>>>>>>>>
>>>>>>>> My saga continues...
>>>>>>>>
>>>>>>>> I've configured the audit log for drs_repl in smb.conf, and below is
>>>>>>>> the log generated.
>>>>>>>> https://transfer.sh/7fen4qCNIQ/drs_repl.log
>>>>>>>>
>>>>>>>> The log level was 5.
>>>>>>>> drs_repl:5@/var/log/samba/drs_repl.log
>>>>>>>>
>>>>>>>> Could someone take a look and help me understand the log?
>>>>>>>
>>>>>>> There is probably nothing wrong with your log, but Firefox doesn't
>>>>>>> like it, it thinks it contains a virus.
>>>>>>>
>>>>>>> Rowland
>>>>>>>
>>>>>>> --
>>>>>>> To unsubscribe from this list go to the following URL and read the
>>>>>>> instructions: https://lists.samba.org/mailman/options/samba
>>
>> --
>> Andrew Bartlett (he/him) https://samba.org/~abartlet/
>> Samba Team Member (since 2001) https://samba.org
>> Samba Team Lead https://catalyst.net.nz/services/samba
>> Catalyst.Net Ltd
>>
>> Proudly developing Samba for Catalyst.Net Ltd - a Catalyst IT group company
>>
>> Samba Development and Support: https://catalyst.net.nz/services/samba
>>
>> Catalyst IT - Expert Open Source Solutions

--
Elias Pereira
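The apparent contradiction in this message between the Proxmox dashboard (very low CPU) and 'top' (around 94% for the LDAP process) comes down to arithmetic, as the quoted reply suggests. A minimal sketch, assuming 'top' is in its default mode, where a process's percentage is relative to one CPU; the function name is hypothetical and only the figures 94% and 4 CPUs come from the thread:

```python
def share_of_whole_vm(per_cpu_percent: float, cpu_count: int) -> float:
    """Convert a top-style percentage (relative to one CPU) into a
    percentage of the VM's total CPU capacity."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return per_cpu_percent / cpu_count


# Figures from the thread: ~94% in top, VM with 4 CPUs.
print(share_of_whole_vm(94.0, 4))  # 23.5
```

So one process saturating a single CPU would show up as under a quarter of total capacity on a 4-CPU VM, which a dashboard averaging over all CPUs may well present as low usage.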
Elias Pereira
2024-Apr-02 12:25 UTC
[Samba] How to diagnose a busy LDAP server process in the Samba AD DC
The saga continues...

I ran for a whole day at log levels 5 and 7 with no error. As soon as I return the log level to the default, the error reappears.

I monitored the "LDAP Query: Duration" entries and didn't notice any crashes in the queries. I don't know whether it counts as a long time, but some queries took 1.5s.

Is there anything else I can do?

On Mon, Mar 25, 2024 at 1:30 PM Elias Pereira <empbilly at gmail.com> wrote:

> Hello Andrew,
>
> Why is it that with the log level set to 5 the NT_STATUS_IO_TIMEOUT error
> doesn't appear, but at the default log level it does?
>
> On Mon, Mar 18, 2024 at 10:33 AM Elias Pereira <empbilly at gmail.com> wrote:
>
>> hi Andrew, thanks for the help!!!
>>
>>> It seems to me the LDAP process being busy would be the root cause here.
>>> Working out what is going on here is a detective task - I always
>>> start with a wireshark trace. The client making all the noise/traffic will
>>> be the one causing the trouble.
>>
>> In the wireshark analysis, should I filter only by the ldap protocol or
>> leave everything? Should I look at something specific in the client logs?
>>
>> On Sun, Mar 10, 2024 at 9:31 PM Andrew Bartlett <abartlet at samba.org> wrote:
>>
>>> Thanks for getting back to me.
>>>
>>> It seems to me the LDAP process being busy would be the root cause
>>> here. Working out what is going on here is a detective task - I
>>> always start with a wireshark trace. The client making all the
>>> noise/traffic will be the one causing the trouble.
>>>
>>> If it isn't clear from that, then look into the DB audit logging for
>>> perhaps busy writes:
>>>
>>> https://wiki.samba.org/index.php/Setting_up_Audit_Logging#Enabling_AD_DC_Database_Audit_Logging
>>>
>>> Finally, set 'log level = 5' and look for logs like:
>>> LDAP Query: Duration was
>>>
>>> This will tell you how long each query is taking, potentially
>>> showing a particularly slow query that needs to be stopped.
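Watching for "LDAP Query: Duration was" entries by eye, as described in this message, could be scripted roughly as follows. This is only a sketch under stated assumptions: the thread gives the marker text but not the full log line format, so the regular expression assumes the duration is the first decimal number after the marker, the sample lines are made up, and the 1.0s threshold is an arbitrary illustration rather than a Samba default.

```python
import re
from typing import Iterable, List, Tuple

# Assumption: the duration in seconds is the first decimal number that
# follows the marker text quoted in the thread. Adjust to the real format.
DURATION_RE = re.compile(r"LDAP Query: Duration was\s+(\d+(?:\.\d+)?)")


def slow_queries(lines: Iterable[str],
                 threshold_s: float = 1.0) -> List[Tuple[float, str]]:
    """Return (duration, line) pairs at or above threshold_s, slowest first."""
    hits = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a query-duration entry
        duration = float(match.group(1))
        if duration >= threshold_s:
            hits.append((duration, line.rstrip("\n")))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Made-up sample lines, not real Samba output.
sample = [
    "LDAP Query: Duration was 0.02s (made-up sample line)",
    "LDAP Query: Duration was 1.50s (made-up sample line)",
    "some unrelated log line",
]
for duration, line in slow_queries(sample):
    print(f"{duration:.2f}s  {line}")
```

In practice the `lines` argument could be an open log file object, so the same function works on a full day of logs; sorting slowest-first puts any query like the 1.5s ones mentioned here at the top.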