Hello,

our AD domain is hosted by two Samba AD domain controllers, version 4.12.6:
- replication between the controllers is fine, no problems
- no schema errors
- no database errors, all fine
- no notable CPU utilization
- no noticeable bandwidth utilization

Recently we deployed the Azure AD connector on a dedicated Windows system (a domain member server). Since this deployment we have been observing the following issues on the DCs:
- CPU utilization (one CPU core fully utilized)
- high bandwidth utilization
- replication messages in the log:

[2020/10/21 17:41:55.043563,  0] ../../source4/rpc_server/drsuapi/getncchanges.c:2910(dcesrv_drsuapi_DsGetNCChanges)
  ../../source4/rpc_server/drsuapi/getncchanges.c:2910: DsGetNCChanges 2nd replication on DN DC= older highwatermark (last_dn CN=userXYZ,OU=Users,DC=)

This happens on only one DC at a time: the one the Azure AD connector is connected to for its AD to AAD sync tasks.

More details:

CPU: mostly just one of the CPU cores assigned to the system is utilized at 100%.

Bandwidth: see the attached graph. The peak starts once the Azure AD connector connects to a particular DC. Note the "uploaded" value of 54 GB from the DC system.

Replication messages: the message shown above repeats every 4-5 seconds. The "last_dn" changes slowly over time, moving to another (user) object every several hours.

No other issues observed.

- If we deactivate the Azure connector, all issues stop (but of course we are then out of sync with AAD).
- If we reboot DC1 or stop its services (DC1 being the one serving the Azure connector), the connector switches to DC2 and the same story repeats (CPU/bandwidth/replication logs).

I found a similar issue reported back in 2017:
https://lists.samba.org/archive/samba/2017-October/211756.html
([Samba] samba getting stuck, highwatermark replication issue?)

It seems this issue is still present, with no difference.

Does anyone else have similar issues? Does anyone know how to resolve them, either on the Azure AD connector side (there are various configuration options available) or (possibly) on the Samba side?

Thank you,
michal
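To put numbers on "every 4-5 seconds" and on how slowly last_dn moves, the log can be summarized with a small script. This is a minimal sketch, assuming the two-line layout shown in the log excerpt above (a timestamped header line followed by the message line). The function names `parse_highwatermark_events` and `summarize` are hypothetical, not part of Samba.

```python
import re
from datetime import datetime
from typing import Dict, List, Tuple

# Header line of a Samba log entry, e.g. "[2020/10/21 17:41:55.043563,  0] ..."
HEADER_RE = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*\d+\]")
# Message line carrying the last_dn of the "older highwatermark" notice
LAST_DN_RE = re.compile(r"older highwatermark \(last_dn (.*)\)\s*$")


def parse_highwatermark_events(text: str) -> List[Tuple[datetime, str]]:
    """Return (timestamp, last_dn) for each 'older highwatermark' message."""
    events = []
    current_ts = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # Remember the timestamp; the message itself is on a later line.
            current_ts = datetime.strptime(header.group(1), "%Y/%m/%d %H:%M:%S.%f")
        hit = LAST_DN_RE.search(line)
        if hit and current_ts is not None:
            events.append((current_ts, hit.group(1)))
    return events


def summarize(events: List[Tuple[datetime, str]]) -> Dict[str, object]:
    """Count messages, distinct last_dn values, and the mean gap in seconds."""
    count = len(events)
    mean_interval = None
    if count > 1:
        span = (events[-1][0] - events[0][0]).total_seconds()
        mean_interval = span / (count - 1)
    return {
        "count": count,
        "distinct_last_dn": len({dn for _, dn in events}),
        "mean_interval_s": mean_interval,
    }
```

Feeding it the contents of the DC's Samba log (the path depends on the installation) gives the message count, the number of distinct last_dn values seen, and the average gap between messages, which makes it easier to compare DC1 and DC2 or runs with different connector settings.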
Oops, it seems pictures (attachments in general) are not accepted here. The graph is available here:
https://i.postimg.cc/xCk6k038/image-2020-10-21-190940.png

On 10/21/2020 6:00 PM, Michal Bruncko wrote:
> BW utilization: you can see example here (peak starts once the Azure
> AD connector connects to particular DC server) (notice the "uploaded"
> data - 54GB - value from DC system):
Just quickly, without looking into the whole issue, but this error message (apart from your redaction) is quite normal and I need to suppress it.

It happens if a client backs off and restarts replication with new flags, such as GET_ANC, because it got a child object before the parent (for example).

On Wed, 2020-10-21 at 18:00 +0200, Michal Bruncko via samba wrote:
> [2020/10/21 17:41:55.043563,  0] ../../source4/rpc_server/drsuapi/getncchanges.c:2910(dcesrv_drsuapi_DsGetNCChanges)
>   ../../source4/rpc_server/drsuapi/getncchanges.c:2910: DsGetNCChanges 2nd replication on DN DC= older highwatermark (last_dn CN=userXYZ,OU=Users,DC=)

--
Andrew Bartlett                       https://samba.org/~abartlet/
Authentication Developer, Samba Team  https://samba.org
Samba Developer, Catalyst IT          https://catalyst.net.nz/services/samba
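The restart behaviour described above can be illustrated with a toy model. This is only a sketch of the idea, not Samba's or the connector's actual implementation: the `Obj` type, the `usn` ordering field, the function names, and the DC=example DNs are all hypothetical. The server hands out objects in change order; if the client meets a child whose parent it has not yet received, it backs off and starts again with a "get ancestors" flag, which makes the server send parents first.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Obj:
    dn: str
    parent: Optional[str]  # DN of the parent; None for the partition root
    usn: int               # hypothetical change number used for ordering


def server_stream(objs: List[Obj], get_anc: bool) -> List[Obj]:
    """Return objects in change order; with get_anc, send ancestors first."""
    by_dn = {o.dn: o for o in objs}
    sent = set()
    out: List[Obj] = []

    def emit(o: Obj) -> None:
        if o.dn in sent:
            return
        if get_anc and o.parent is not None and o.parent in by_dn:
            emit(by_dn[o.parent])  # parent goes out before the child
        sent.add(o.dn)
        out.append(o)

    for o in sorted(objs, key=lambda x: x.usn):
        emit(o)
    return out


def client_sync(objs: List[Obj]) -> Tuple[List[str], int]:
    """Apply objects in order; restart once with get_anc on an orphan child."""
    restarts = 0
    get_anc = False
    while True:
        known = set()
        applied: List[str] = []
        orphan_seen = False
        for o in server_stream(objs, get_anc):
            if o.parent is not None and o.parent not in known:
                orphan_seen = True  # child arrived before its parent
                break
            known.add(o.dn)
            applied.append(o.dn)
        if not orphan_seen:
            return applied, restarts
        if get_anc:
            raise RuntimeError("orphan object even with ancestors requested")
        # Back off and begin again from the start, this time asking for ancestors.
        get_anc = True
        restarts += 1
```

In this model each restart begins from an earlier position than the server last handed out, which is the kind of situation the "2nd replication ... older highwatermark" message reports.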
Hi Michal,

It seems we are doing similar things at the moment: getting Samba to work with Azure AD.

We also see high CPU usage on the DC that the Azure AD Connect server is connected to, between 70 and 100 percent in our case.

We are not seeing any replication issues since deploying Azure AD Connect, and I have a script that automatically checks replication every few minutes. I was the one who reported the highwatermark issue back in 2017, but we are not getting those messages now. Between 2017 and now we stopped testing Azure, and I took it up again only last week.

On the Samba DC side, we observe that Samba is very busy talking to the Azure AD Connect machine, even though we filter the sync using one group containing only three members, none of which are actually changing. (This is just for testing now.)

How many users are you syncing? Are you using pass-through authentication, or syncing password hashes? And is what you chose working for you, authentication-wise?

Regards,
MJ

On 10/21/20 7:10 PM, Michal Bruncko via samba wrote:
> ups, seems pictures (attachments in general) are not accepted here,
> screen (graph) is available here:
> https://i.postimg.cc/xCk6k038/image-2020-10-21-190940.png
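The replication check script mentioned above is not shown in the thread. As a rough idea of what such a periodic check could look like, here is a minimal sketch. It assumes that the text output of `samba-tool drs showrepl` reports an "N consecutive failure(s)" line per replication link; the function names `failing_links` and `check_local_dc` are hypothetical.

```python
import re
import subprocess
from typing import List

# Matches lines such as "0 consecutive failure(s)." in the showrepl output.
FAILURE_RE = re.compile(r"(\d+)\s+consecutive failure\(s\)")


def failing_links(showrepl_text: str) -> List[int]:
    """Return the non-zero consecutive-failure counts found in the text."""
    counts = [int(m.group(1)) for m in FAILURE_RE.finditer(showrepl_text)]
    return [c for c in counts if c > 0]


def check_local_dc() -> bool:
    """Run `samba-tool drs showrepl` on this DC; True means no failing links."""
    result = subprocess.run(
        ["samba-tool", "drs", "showrepl"],
        capture_output=True,
        text=True,
        check=True,
    )
    return not failing_links(result.stdout)
```

Calling `check_local_dc()` from cron or a systemd timer every few minutes, and alerting when it returns False, would give a basic replication monitor. Parsing is kept separate from running the command so the parsing part can be tried on saved output first.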