Dale Renton
2016-Dec-16 20:54 UTC
[Samba] Replication with Multiple Sites in a Hub and Spoke Topology
Samba 4.5.1 (started with this version as well)

I started with 3 domain controllers, DC1 and DC2 at the hub, and another, DC3 as a spoke. Everything was running smoothly with this config. No problems with 'samba-tool drs showrepl'. As soon as I started adding more 'spoke' domain controllers I'm getting timeouts in the 'samba-tool drs showrepl' command. I believe the problem is arising because the spokes cannot ping/see one another. After running a tcpdump I do indeed see spokes trying to communicate. I have site links created for each spoke. So in Active Directory Sites and Services there is a site link and subnet for each spoke and the hub. I'm also having issues with 'samba-tool domain join ad.example.com DC' timeouts, but if I keep trying, it eventually works. At the moment I have 5 domain controllers with plans on adding more.

samba-tool drs showrepl works fine on both hub domain controllers, DC1 and DC2

samba-tool dbcheck ( works fine on all DCs )
Checking 856 objects
Checked 856 objects (0 errors)

samba-tool drs showrepl on DC3 spoke returns :

Failed to bind to uuid e3514235-4b06-11d1-ab04-00c04fc2dcd2 for ncacn_ip_tcp:192.168.2.1[1024,seal,target_hostname=dc3.ad.example.com,abstract_syntax=e3514235-4b06-11d1-ab04-00c04fc2dcd2/0x00000004,localaddress=192.168.2.1] NT_STATUS_IO_TIMEOUT
ERROR(<class 'samba.drs_utils.drsException'>): DRS connection to dc3.ad.example.com failed - drsException: DRS connection to dc3.ad.example.com failed: (-1073741643, '{Device Timeout} The specified I/O operation on %hs was not completed before the time-out period expired.')
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/netcmd/drs.py", line 41, in drsuapi_connect
    (ctx.drsuapi, ctx.drsuapi_handle, ctx.bind_supported_extensions) drs_utils.drsuapi_connect(ctx.server, ctx.lp, ctx.creds)
  File "/usr/local/samba/lib64/python2.7/site-packages/samba/drs_utils.py", line 54, in drsuapi_connect
    raise drsException("DRS connection to %s failed: %s" % (server, e))

smb.conf :

[global]
        netbios name = DC3
        realm = AD.EXAMPLE.COM
        server services = s3fs, rpc, nbt, wrepl, ldap, cldap, kdc, drepl, winbindd, ntp_signd, kcc, dnsupdate
        workgroup = EXAMPLE
        server role = active directory domain controller
        idmap_ldb:use rfc2307 = yes

[netlogon]
        path = /usr/local/samba/var/locks/sysvol/ad.example.com/scripts
        read only = No

[sysvol]
        path = /usr/local/samba/var/locks/sysvol
        read only = No

I'm assuming this is an issue with samba_kcc but I'm not sure what steps to take next.

Thanks,
Dale
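The message above suspects that spokes cannot reach one another. One way to check that from a given DC is to try opening TCP connections to each of the other DCs. The following is only a minimal sketch: the helper names are hypothetical, port 135 is the usual RPC endpoint mapper port, and port 1024 is taken from the error output above. Adjust both to match the actual setup.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def reachability_report(hosts, ports=(135, 1024), timeout=3.0):
    """Map every (host, port) pair to True (reachable) or False (not reachable)."""
    return {(h, p): can_connect(h, p, timeout) for h in hosts for p in ports}
```

Calling reachability_report(["dc1.ad.example.com", "dc4.ad.example.com"]) on DC3, for instance, would show whether the hub and another spoke are reachable from that spoke (dc4 is an assumed hostname, not one given in the thread).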
Garming Sam
2016-Dec-18 21:20 UTC
[Samba] Replication with Multiple Sites in a Hub and Spoke Topology
Hi,

It seems unlikely that the KCC is the cause of these issues. The KCC is only responsible for telling each DC who to connect to (and when) and doesn't actually affect any underlying network connectivity. Connectivity between the spokes should not be required, and the communication between them is usually just some stale data. But none of that should affect either of these commands.

Unless the DRS server is particularly busy, this points to actual connectivity issues. If you're running samba-tool drs showrepl, it looks like it should only contact the DC you are on. How long does it take before each of the commands bails out? When doing the domain join, do you pick a particular server (and/or IP) to run against, and does it make a difference?

Cheers,
Garming
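The question of how long each command runs before bailing out can be answered by timing the command. A minimal sketch follows; the helper name time_command and the 300-second limit are assumptions for illustration, not part of samba-tool.

```python
import subprocess
import time


def time_command(argv, limit=300.0):
    """Run argv and return (exit_code, elapsed_seconds).

    exit_code is None if the command was still running after `limit` seconds
    and had to be killed.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=limit,
        )
        code = proc.returncode
    except subprocess.TimeoutExpired:
        code = None
    return code, time.monotonic() - start
```

For example, time_command(["samba-tool", "drs", "showrepl"]) run on the affected spoke would report how long the command takes to fail, which helps tell a fixed network timeout apart from a busy server.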
Dale Renton
2016-Dec-19 18:42 UTC
[Samba] Replication with Multiple Sites in a Hub and Spoke Topology
I figured out the problem. I ran strace on 'samba-tool drs showrepl' and indeed it did show one spoke trying to communicate with another spoke. This is where the command would hang for 2 minutes and return the NT_STATUS_IO_TIMEOUT.

I changed the krb5.conf on DC3 only (left the hub domain controllers as is) from :

[libdefaults]
        default_realm = AD.EXAMPLE.COM
        dns_lookup_realm = false
        dns_lookup_kdc = true

to

[libdefaults]
        default_realm = AD.EXAMPLE.COM
        dns_lookup_realm = false
        dns_lookup_kdc = false

[realms]
        AD.EXAMPLE.COM = {
                kdc = DC3.AD.EXAMPLE.COM
                admin_server = DC3.AD.EXAMPLE.COM
                default_domain = AD.EXAMPLE.COM
        }

Now everything seems to be working again. The domain join worked great too. I'm assuming there is no harm in making this change?

Thanks,
Dale
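With dns_lookup_kdc = true, the Kerberos library finds KDCs through DNS SRV records, and in an AD domain those records list every DC, so a spoke may pick another spoke it cannot reach. That is presumably why pinning the KDC helped here. When more spokes are added, the same file can be generated per DC. This is a rough sketch only: render_krb5_conf is a hypothetical helper, and it reproduces the settings from the message above, including the closing brace that a [realms] entry needs.

```python
def render_krb5_conf(realm, kdc_host):
    """Build krb5.conf text that names one KDC explicitly instead of using DNS SRV discovery."""
    realm = realm.upper()  # Kerberos realm names are conventionally upper case
    lines = [
        "[libdefaults]",
        "    default_realm = " + realm,
        "    dns_lookup_realm = false",
        "    dns_lookup_kdc = false",
        "",
        "[realms]",
        "    " + realm + " = {",
        "        kdc = " + kdc_host,
        "        admin_server = " + kdc_host,
        "        default_domain = " + realm,
        "    }",
    ]
    return "\n".join(lines) + "\n"
```

On DC3 this would be called as render_krb5_conf("AD.EXAMPLE.COM", "DC3.AD.EXAMPLE.COM"). One trade-off of this design is that the DC then depends on the single KDC named in the file.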