thr3ads.net - samba - [Samba] Bug 6870 resurfaced in Samba 4.2.10 [Oct 2016]

Rebecca Gellman

2016-Oct-17 16:13 UTC

[Samba] Bug 6870 resurfaced in Samba 4.2.10

Hi, 

So I did some digging into the source code, and I think I've found the
issue. Around line 120 of source3/libads/cldap.c: 

for (i=0; i<num_servers; i++) {
  NTSTATUS status; 

  status = cldap_socket_init(state->cldap,
    NULL, /* local_addr */
    state->servers[i],
    &state->cldap[i]); 

  if (tevent_req_nterror(req, status)) {
    return tevent_req_post(req, ev);
  } 

  /* Code omitted for brevity */ 

} 

This is in cldap_multi_netlogon_send(), a function that sends CLDAP
requests to multiple DCs in one go. The loop here sets up a socket for
each DC. cldap_socket_init() in turn (possibly several calls deeper)
sets up the UDP socket, and calls connect() on it, which fails with
"Network unreachable". This bubbles up the chain and comes back to
cldap_multi_netlogon_send() as NT_STATUS_NETWORK_UNREACHABLE. 

Note, however, how the function returns: it reports an error if *any*
of the servers queried returned an error, even if others succeeded.

In my case, even though server 0 (IPv4) succeeds, this call returns an
error because server 1 (IPv6) could not be reached. 
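
One way to avoid the all-or-nothing behaviour described above is to skip servers whose setup fails and to report failure only when none are usable. The following is a minimal sketch in plain C. It is not Samba code and not the patch proposed later in this thread: `server_init_fn`, `init_reachable_servers` and `fake_init` are hypothetical names, and `fake_init` merely pretends that any IPv6 literal is unreachable.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a per-server setup routine such as
 * cldap_socket_init(); returns 0 on success or an errno-style code. */
typedef int (*server_init_fn)(const char *server);

/*
 * Try to initialise every server and record which ones are usable.
 * Returns the number of usable servers. *last_err holds the most recent
 * failure code, or 0 if nothing failed. The caller should treat the
 * operation as failed only when the return value is 0.
 */
static size_t init_reachable_servers(const char *const *servers,
                                     size_t num_servers,
                                     server_init_fn init,
                                     bool *usable,
                                     int *last_err)
{
    size_t ok = 0;

    *last_err = 0;
    for (size_t i = 0; i < num_servers; i++) {
        int err = init(servers[i]);

        if (err != 0) {
            usable[i] = false;
            *last_err = err;
            continue; /* skip this server instead of aborting the loop */
        }
        usable[i] = true;
        ok++;
    }
    return ok;
}

/* Illustration only: pretend every IPv6 literal is unreachable. */
static int fake_init(const char *server)
{
    return strchr(server, ':') != NULL ? ENETUNREACH : 0;
}
```

Called with the two DC addresses from the log and `fake_init`, the helper marks the IPv4 server as usable and the IPv6 server as failed, so a caller could still send its request to the IPv4 server.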

To reiterate, this is in Samba 4.2.10, which ships with Debian 8
(Jessie), and occurs when running "net ads workgroup". 

This is the relevant section of the D10 log (compare with the strace
from my previous email): 

Adding 2 DC's from auto lookup
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
remove_duplicate_addrs2: looking for duplicate address/port pairs
get_dc_list: returning 2 ip addresses in an ordered list
get_dc_list: 192.168.81.132:389 2001:8b0:1627:1::2:389
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 192.168.81.132
check_negative_conn_cache returning result 0 for domain FEDERATION.STARFLEET-NET.CO.UK server 2001:8b0:1627:1::2
ads_try_connect: sending CLDAP request to 2 servers (realm: FEDERATION.STARFLEET-NET.CO.UK)
ads_cldap_netlogon: cldap_multi_netlogon failed: NT_STATUS_NETWORK_UNREACHABLE
ads_try_connect: CLDAP request failed.
Adding cache entry with key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,192.168.81.132] and timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK (192.168.81.132) to failed conn cache
Adding cache entry with key=[NEG_CONN_CACHE/FEDERATION.STARFLEET-NET.CO.UK,2001:8b0:1627:1::2] and timeout=[Mon Oct 17 16:51:46 2016 BST] (60 seconds ahead)
add_failed_connection_entry: added domain FEDERATION.STARFLEET-NET.CO.UK (2001:8b0:1627:1::2) to failed conn cache
ads_connect: No logon servers

Would this all be better (and/or more actively worked on) if sent to
samba-technical ? 

Regards 

Rebecca Gellman 

On 2016-10-14 07:40, L.P.H. van Belle via samba wrote: 
> Hi,
>
> Did you check whether ifconfig still shows IPv6 addresses (even ::1)?
>
> Can you check that?
>
> I have several machines with IPv6 enabled and several with IPv4 only.
> As of 4.1.17+ I haven't seen this happening here. I'm now on 4.4.5.
> I think you have forgotten something.
> 
> Greetz, 
> 
> Louis
> 
>> -----Original message-----
>> From: samba [mailto:samba-bounces at lists.samba.org] On behalf of Rebecca Gellman via samba
>> Sent: Thursday 13 October 2016 17:07
>> To: samba at lists.samba.org
>> Subject: [Samba] Bug 6870 resurfaced in Samba 4.2.10
>> 
>> According to this bugzilla entry, bug 6870 has been fixed as of at least
>> version 3.5:
>> 
>> https://bugzilla.samba.org/show_bug.cgi?id=6870
>> 
>> However, I assert that it is present in 4.2.10, which ships with Debian
>> Jessie.
>> 
>> On my home network (IPv4 and IPv6), a box with Samba 4.2.10 with IPv6
>> disabled (via sysctl), will fail to contact a DC because the IPv6
>> connect fails immediately before the v4 connect has a chance to succeed.
>>
>> I determined this by strace'ing the "net ads workgroup" command, which
>> resulted in the following:
>> 
>> 11:41:52 socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 11 <0.000027>
>> 11:41:52 fcntl(11, F_GETFL) = 0x2 (flags O_RDWR) <0.000015>
>> 11:41:52 fcntl(11, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000016>
>> 11:41:52 fcntl(11, F_GETFD) = 0 <0.000015>
>> 11:41:52 fcntl(11, F_SETFD, FD_CLOEXEC) = 0 <0.000016>
>> 11:41:52 connect(11, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("192.168.81.132")}, 16) = 0 <0.000050>
>> 11:41:52 socket(PF_INET6, SOCK_DGRAM, IPPROTO_IP) = 12 <0.000025>
>> 11:41:52 fcntl(12, F_GETFL) = 0x2 (flags O_RDWR) <0.000016>
>> 11:41:52 fcntl(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000015>
>> 11:41:52 fcntl(12, F_GETFD) = 0 <0.000016>
>> 11:41:52 fcntl(12, F_SETFD, FD_CLOEXEC) = 0 <0.000015>
>> 11:41:52 setsockopt(12, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0 <0.000018>
>> 11:41:52 connect(12, {sa_family=AF_INET6, sin6_port=htons(389), inet_pton(AF_INET6, "2001:8b0:1627:1::2", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable) <0.000032>
>> 11:41:52 close(12) = 0 <0.000028>
>> 11:41:52 close(11) = 0 <0.000024>
>> 11:41:52 close(10) = 0 <0.000020>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_RDLCK, whence=SEEK_SET, start=288, len=1}) = 0 <0.000021>
>> 11:41:52 fcntl(8, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=288, len=1}) = 0 <0.000018>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000032>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000019>
>> 11:41:52 fcntl(9, F_SETLKW, {type=F_UNLCK, whence=SEEK_SET, start=48, len=1}) = 0 <0.000017>
>> 11:41:52 write(2, "ads_connect: No logon servers\n", 30ads_connect: No logon servers
>> 
>> As you can see, sockets 11 and 12 are set up to contact the DC, 11 to v4,
>> and 12 to v6. connect() on socket 11 is successful (returns 0), but
>> connect() on socket 12 returns -1 due to "Network unreachable" - this is
>> correct as the box in question does not have IPv6.
>>
>> The attempt is abandoned (implied by the immediate closing of sockets 11
>> and 12, and the writing of "No logon servers" to stderr) before any
>> attempt is made to talk on socket 11 (v4).
>> 
>> After futzing the box to have an IPv6 address with appropriate routing,
>> the attempt succeeds as expected. However, for reasons (too long to go
>> into here) this is not a solution, only a means of proving the problem.
>> 
>> Since most DCs publish v6 records of some kind in DNS in an AD setup
>> these days, it would seem that this behaviour could do with urgently
>> fixing.
>> 
>> Any comments from the samba bods, or should I forward this on to
>> samba-technical ?
>> 
>> Thanks
>> 
>> -- Rebecca Gellman
>> 
>> --
>> To unsubscribe from this list go to the following URL and read the
>> instructions:  https://lists.samba.org/mailman/options/samba
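
The failing call in the strace above can be reproduced in isolation. The following is a minimal sketch, assuming a POSIX system; `udp_connect_probe` is a hypothetical helper, not part of Samba. It creates a UDP socket and calls connect() on it without sending anything. Unlike the traced code, it leaves the socket blocking and does not set IPV6_V6ONLY. On a Linux host with no IPv6 route, calling it with the DC's IPv6 address would be expected to return ENETUNREACH, as in the trace, because connect() on a UDP socket only performs a local route lookup.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Create a UDP socket and connect() it to addr:port.
 * Returns 0 on success, the errno from socket() or connect() on failure,
 * or EINVAL if addr is not a valid IPv4 or IPv6 literal.
 * No packet is sent to the peer.
 */
static int udp_connect_probe(const char *addr, unsigned short port)
{
    struct sockaddr_storage ss;
    socklen_t len;
    int family;

    memset(&ss, 0, sizeof(ss));
    if (strchr(addr, ':') != NULL) {
        /* A colon can only appear in an IPv6 literal. */
        struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)&ss;

        if (inet_pton(AF_INET6, addr, &sin6->sin6_addr) != 1) {
            return EINVAL;
        }
        sin6->sin6_family = AF_INET6;
        sin6->sin6_port = htons(port);
        family = AF_INET6;
        len = sizeof(*sin6);
    } else {
        struct sockaddr_in *sin = (struct sockaddr_in *)&ss;

        if (inet_pton(AF_INET, addr, &sin->sin_addr) != 1) {
            return EINVAL;
        }
        sin->sin_family = AF_INET;
        sin->sin_port = htons(port);
        family = AF_INET;
        len = sizeof(*sin);
    }

    int fd = socket(family, SOCK_DGRAM, 0);
    if (fd < 0) {
        return errno;
    }

    /* Save errno before close() has a chance to overwrite it. */
    int err = (connect(fd, (struct sockaddr *)&ss, len) == 0) ? 0 : errno;
    close(fd);
    return err;
}
```

For example, `udp_connect_probe("2001:8b0:1627:1::2", 389)` mirrors the connect() on socket 12 in the trace, and `udp_connect_probe("192.168.81.132", 389)` mirrors the one on socket 11.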

Jeremy Allison

2016-Oct-17 16:41 UTC


[Samba] Bug 6870 resurfaced in Samba 4.2.10

On Mon, Oct 17, 2016 at 05:13:08PM +0100, Rebecca Gellman via samba wrote:
> Note however the return from the function: it returns an error if *any*
> of the servers queried returned an error, even if any of them succeeded.

Great analysis - thanks! I'll look into a patch for this.

We'll need a new bug report for this one.

Jeremy Allison

2016-Oct-17 17:13 UTC


[Samba] Bug 6870 resurfaced in Samba 4.2.10

On Mon, Oct 17, 2016 at 09:41:10AM -0700, Jeremy Allison via samba wrote:
> On Mon, Oct 17, 2016 at 05:13:08PM +0100, Rebecca Gellman via samba wrote:
> > Note however the return from the function: it returns an error if *any*
> > of the servers queried returned an error, even if any of them succeeded.
> 
> Great analysis - thanks ! I'll look into a patch for this.
> 
> We'll need a new bug report for this one.
OK, here's the new bug:

https://bugzilla.samba.org/show_bug.cgi?id=12381

and here is (I think) the patch. Can you test this
and let me know if it fixes your test case?

CC'ing samba-technical for follow-ups.

Cheers,

	Jeremy.
