Hi,

So I did some digging into the source code, and I think I've found the
issue. Around line 120 of source3/libads/cldap.c:

    for (i=0; i<num_servers; i++) {
        NTSTATUS status;

        status = cldap_socket_init(state->cldap,
                                   NULL, /* local_addr */
                                   state->servers[i],
                                   &state->cldap[i]);

        if (tevent_req_nterror(req, status)) {
            return tevent_req_post(req, ev);
        }

        /* Code omitted for brevity */

    }

This is in cldap_multi_netlogon_send(), a function that sends CLDAP
requests to multiple DCs in one go. The loop here sets up a socket for
each DC. cldap_socket_init() in turn (possibly several calls deeper)
sets up the UDP socket, and calls connect() on it, which fails with
"Network unreachable". This bubbles up the chain and comes back to
cldap_multi_netlogon_send() as NT_STATUS_NETWORK_UNREACHABLE.

Note however the return from the function: it returns an error if *any*
of the servers queried returned an error, even if any of them succeeded.

In my case, even though server 0 (IPv4) succeeds, this call returns an
error because server 1 (IPv6) could not be reached.

To reiterate, this is in Samba 4.2.10, which ships with Debian 8
(Jessie), and occurs when running "net ads workgroup".
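The difference between giving up on the first unreachable server and carrying on with the remaining ones can be sketched in plain C. This is not Samba code and not the patch mentioned later in this thread; it is only an illustration under stated assumptions. The names server_init_fn, init_servers_strict, init_servers_tolerant and fake_init_no_ipv6 are hypothetical, and the fake init function simply pretends that every IPv6 address is unreachable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-server socket setup step.
 * Returns 0 on success or a positive errno value on failure. */
typedef int (*server_init_fn)(const char *addr);

/* Strict policy: stop at the first server that fails to initialise.
 * One unreachable server makes the whole batch fail. */
static int init_servers_strict(const char *const *servers, size_t n,
                               server_init_fn init, int *usable)
{
    assert(servers != NULL && init != NULL && usable != NULL);
    memset(usable, 0, n * sizeof(*usable));

    for (size_t i = 0; i < n; i++) {
        int err = init(servers[i]);
        if (err != 0) {
            return err;
        }
        usable[i] = 1;
    }
    return 0;
}

/* Tolerant policy: record which servers failed, continue with the
 * rest, and report an error only if no server could be initialised. */
static int init_servers_tolerant(const char *const *servers, size_t n,
                                 server_init_fn init, int *usable)
{
    int first_err = 0;
    size_t ok = 0;

    assert(servers != NULL && init != NULL && usable != NULL);

    for (size_t i = 0; i < n; i++) {
        int err = init(servers[i]);
        usable[i] = (err == 0);
        if (err == 0) {
            ok++;
        } else if (first_err == 0) {
            first_err = err;
        }
    }
    if (ok > 0) {
        return 0;
    }
    /* Either every server failed, or the list was empty. */
    return first_err != 0 ? first_err : EINVAL;
}

/* Illustration only: behave like a host with no IPv6 route, so any
 * address containing ':' is reported as unreachable. */
static int fake_init_no_ipv6(const char *addr)
{
    return strchr(addr, ':') != NULL ? ENETUNREACH : 0;
}
```

Called with the two DC addresses from this report, the strict variant fails with ENETUNREACH although the IPv4 server was set up fine, while the tolerant variant succeeds and marks only the IPv6 entry as unusable.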

This is the relevant section of the D10 log (compare with the strace
from my previous email):

Adding 2 DC's from auto lookup
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
remove_duplicate_addrs2: looking for duplicate address/port pairs
get_dc_list: returning 2 ip addresses in an ordered list
get_dc_list: 192.168.81.132:389 2001:8b0:1627:1::2:389
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
ads_try_connect: sending CLDAP request to 2 servers (realm: FEDERATION.STARFLEET-NET.CO.UK)
ads_cldap_netlogon: cldap_multi_netlogon failed: NT_STATUS_NETWORK_UNREACHABLE
ads_try_connect: CLDAP request failed.
Adding cache entry with key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,192.168.81.132] and timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK (192.168.81.132) to failed conn cache
Adding cache entry with key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,2001:8b0:1627:1::2] and timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK (2001:8b0:1627:1::2) to failed conn cache
ads_connect: No logon servers

Would this all be better (and/or more actively worked on) if sent to
samba-technical?

Regards

Rebecca Gellman

On 2016-10-14 07:40, L.P.H. van Belle via samba wrote:
> Hai,
>
> Did you check if ifconfig still shows ipv6 addresses. ( even ::1 )
>
> Can you check that.
>
> I have several with ipv6 on and several only ipv4.
> As of 4.1.17+ I didn't see this happening here. Now on 4.4.5
> I think you have forgotten something.
>
> Greetz,
>
> Louis
>
>> -----Original message-----
>> From: samba [mailto:samba-bounces at lists.samba.org] On behalf of Rebecca Gellman via samba
>> Sent: Thursday 13 October 2016 17:07
>> To: samba at lists.samba.org
>> Subject: [Samba] Bug 6870 resurfaced in Samba 4.2.10
>>
>> According to this bugzilla entry, bug 6870 has been fixed as of at least
>> version 3.5:
>>
>> https://bugzilla.samba.org/show_bug.cgi?id=6870
>>
>> However, I assert that it is present in 4.2.10, which ships with Debian
>> Jessie.
>>
>> On my home network (IPv4 and IPv6), a box with Samba 4.2.10 with IPv6
>> disabled (via sysctl), will fail to contact a DC because the IPv6
>> connect fails immediately before the v4 connect has a chance to succeed.
>>
>> I determined this by strace'ing the "net ads workgroup" command, which
>> resulted in the following:
>>
>> 11:41:52 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 11 <0.000027>
>> 11:41:52 fcntl(11, F_GETFL) = 0x2 (flags O_RDWR) <0.000015>
>> 11:41:52 fcntl(11, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000016>
>> 11:41:52 fcntl(11, F_GETFD) = 0 <0.000015>
>> 11:41:52 fcntl(11, F_SETFD, FD_CLOEXEC) = 0 <0.000016>
>> 11:41:52 connect(11, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("192.168.81.132")}, 16) = 0 <0.000050>
>> 11:41:52 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 12 <0.000025>
>> 11:41:52 fcntl(12, F_GETFL) = 0x2 (flags O_RDWR) <0.000016>
>> 11:41:52 fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000015>
>> 11:41:52 fcntl(12, F_GETFD) = 0 <0.000016>
>> 11:41:52 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 <0.000015>
>> 11:41:52 setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0 <0.000018>
>> 11:41:52 connect(12, {sa_family=AF_INET6, sin6_port=htons(389), inet_pton(AF_INET6, "2001:8b0:1627:1::2", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable) <0.000032>
>> 11:41:52 close(12) = 0 <0.000028>
>> 11:41:52 close(11) = 0 <0.000024>
>> 11:41:52 close(10) = 0 <0.000020>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=288, len=1}) = 0 <0.000021>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=288, len=1}) = 0 <0.000018>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000032>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000017>
>> 11:41:52 write(2, "ads_connect: No logon servers\n", 30ads_connect: No logon servers
>>
>> As you can see, sockets 11 and 12 are set up to contact the DC, 11 to v4,
>> and 12 to v6. connect() on socket 11 is successful (returns 0), but
>> connect() on socket 12 returns -1 due to "Network unreachable" - this is
>> correct as the box in question does not have IPv6.
>>
>> The attempt is abandoned (implied by the immediate closing of sockets 11
>> and 12, and the writing of "No logon servers" to stderr) before any
>> attempt is made to talk on socket 11 (v4).
>>
>> After futzing the box to have an IPv6 address with appropriate routing,
>> the attempt succeeds as expected. However, for reasons (too long to go
>> into here) this is not a solution, only a means of proving the problem.
>>
>> Since most DCs publish v6 records of some kind in DNS in an AD setup
>> these days, it would seem that this behaviour could do with urgently
>> fixing.
>>
>> Any comments from the samba bods, or should I forward this on to
>> samba-technical?
>>
>> Thanks
>>
>> -- Rebecca Gellman
>>
>> --
>> To unsubscribe from this list go to the following URL and read the
>> instructions: https://lists.samba.org/mailman/options/samba

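The strace in the quoted message shows connect() on the IPv6 UDP socket failing within microseconds. On Linux, connect() on a datagram socket sends no packets; it only fixes the peer address, and the kernel's route lookup at that point can fail straight away with ENETUNREACH. A minimal sketch of that step follows, assuming a POSIX system. udp_connect is a hypothetical helper written for this illustration, not a Samba function, and it accepts only numeric addresses.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket and connect() it to a numeric IPv4 or IPv6
 * address. Returns the file descriptor on success or -errno on
 * failure (for example -ENETUNREACH when there is no route). */
static int udp_connect(const char *addr, unsigned short port)
{
    struct sockaddr_in in4;
    struct sockaddr_in6 in6;
    const struct sockaddr *sa;
    socklen_t len;
    int family;
    int fd;

    memset(&in4, 0, sizeof(in4));
    memset(&in6, 0, sizeof(in6));

    if (inet_pton(AF_INET, addr, &in4.sin_addr) == 1) {
        family = AF_INET;
        in4.sin_family = AF_INET;
        in4.sin_port = htons(port);
        sa = (const struct sockaddr *)&in4;
        len = sizeof(in4);
    } else if (inet_pton(AF_INET6, addr, &in6.sin6_addr) == 1) {
        family = AF_INET6;
        in6.sin6_family = AF_INET6;
        in6.sin6_port = htons(port);
        sa = (const struct sockaddr *)&in6;
        len = sizeof(in6);
    } else {
        return -EINVAL; /* not a numeric IPv4 or IPv6 address */
    }

    fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0) {
        return -errno;
    }

    /* No datagram is sent here; a failure reflects local routing. */
    if (connect(fd, sa, len) < 0) {
        int err = errno;
        close(fd);
        return -err;
    }
    return fd;
}
```

A caller that tries several addresses can treat a negative return for one address as a reason to skip that address rather than abandon the whole attempt.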
On Mon, Oct 17, 2016 at 05:13:08PM +0100, Rebecca Gellman via samba wrote:
>
> Hi,
>
> So I did some digging into the source code, and I think I've found the
> issue. Around line 120 of source3/libads/cldap.c:
>
>     for (i=0; i<num_servers; i++) {
>         NTSTATUS status;
>
>         status = cldap_socket_init(state->cldap,
>                                    NULL, /* local_addr */
>                                    state->servers[i],
>                                    &state->cldap[i]);
>
>         if (tevent_req_nterror(req, status)) {
>             return tevent_req_post(req, ev);
>         }
>
>         /* Code omitted for brevity */
>
>     }
>
> This is in cldap_multi_netlogon_send(), a function that sends CLDAP
> requests to multiple DCs in one go. The loop here sets up a socket for
> each DC. cldap_socket_init() in turn (possibly several calls deeper)
> sets up the UDP socket, and calls connect() on it, which fails with
> "Network unreachable". This bubbles up the chain and comes back to
> cldap_multi_netlogon_send() as NT_STATUS_NETWORK_UNREACHABLE.
>
> Note however the return from the function: it returns an error if *any*
> of the servers queried returned an error, even if any of them succeeded.

Great analysis - thanks! I'll look into a patch for this.

We'll need a new bug report for this one.

> In my case, even though server 0 (IPv4) succeeds, this call returns an
> error because server 1 (IPv6) could not be reached.
>
> To reiterate, this is in Samba 4.2.10, which ships with Debian 8
> (Jessie), and occurs when running "net ads workgroup".

On Mon, Oct 17, 2016 at 09:41:10AM -0700, Jeremy Allison via samba wrote:
> On Mon, Oct 17, 2016 at 05:13:08PM +0100, Rebecca Gellman via samba wrote:
> >
> > Hi,
> >
> > So I did some digging into the source code, and I think I've found the
> > issue. Around line 120 of source3/libads/cldap.c:
> >
> >     for (i=0; i<num_servers; i++) {
> >         NTSTATUS status;
> >
> >         status = cldap_socket_init(state->cldap,
> >                                    NULL, /* local_addr */
> >                                    state->servers[i],
> >                                    &state->cldap[i]);
> >
> >         if (tevent_req_nterror(req, status)) {
> >             return tevent_req_post(req, ev);
> >         }
> >
> >         /* Code omitted for brevity */
> >
> >     }
> >
> > This is in cldap_multi_netlogon_send(), a function that sends CLDAP
> > requests to multiple DCs in one go. The loop here sets up a socket for
> > each DC. cldap_socket_init() in turn (possibly several calls deeper)
> > sets up the UDP socket, and calls connect() on it, which fails with
> > "Network unreachable". This bubbles up the chain and comes back to
> > cldap_multi_netlogon_send() as NT_STATUS_NETWORK_UNREACHABLE.
> >
> > Note however the return from the function: it returns an error if *any*
> > of the servers queried returned an error, even if any of them succeeded.
>
> Great analysis - thanks! I'll look into a patch for this.
>
> We'll need a new bug report for this one.

OK, here's the new bug:

https://bugzilla.samba.org/show_bug.cgi?id=12381

and here is (I think) the patch. Can you test this and let me know
if it fixes your test case?

CC:ing to samba-technical for followups.

Cheers,

Jeremy.