vincent at cojot.name
2025-Feb-11 17:23 UTC
[Samba] Floating IPs and Samba file servers (without ctdb)?
Warning: long post

Hi everyone,

I'm revisiting an older topic in the context of Samba AD/DC and file
serving, and I'm wondering if anyone could shed some light.

I am struggling to make VIPs work for Samba file sharing in an Active
Directory environment serving Windows 10/11 clients from RHEL
fileservers joined to AD.

Here's the setup:

- Active Directory domain (forest level 2003, I think) with rfc2307
  attributes for users. Based on Samba 4.20.7 and RHEL 8.10. Two DCs.
- Several RHEL hosts serving 'replicated' content over NFS and Samba.
  No CTDB. All RHEL fileservers are on RHEL 8.10. They are joined to AD
  using winbind.
- A number of floating IPs (VIPs) float across the RHEL fileservers,
  and clients are supposed to use the FQDNs of those VIPs to access
  shares.
- Windows 10 and 11 clients joined to the realm and accessing
  fileshares from the RHEL servers.

Here's what works and what doesn't:

- <FQDN host server1>/share1 : Always works.
- <IP of server1>/share1 : Always works.
- <IP of VIP>/share1 : Always works.
- <FQDN of VIP>/share1 : Never works reliably. Sometimes it works for a
  few hours, and then the next day the shares are inaccessible.

For now I've settled on <IP of VIP>/share1 because it's the only one
that works reliably and 'floats' properly from server1 to server2 if a
failure occurs.

I've tried a few things:

- I've created a service account in AD and added two SPNs to it:
  "host/<FQDN of VIP>" and "cifs/<FQDN of VIP>".
- I've added the <FQDN of VIP> to 'additional dns hostnames' in the
  Samba config file on each of the RHEL servers (server1, server2,
  server3...).
- I've tried adding those SPNs to the keytab on the RHEL servers by
  using adcli like this:

  adcli -v update --computer-password-lifetime=0 --add-samba-data -D <AD REALM> --add-service-principal="host/<FQDN of VIP>" --add-service-principal="cifs/<FQDN of VIP>"

If I use the following invocation, it works:

  # smbclient --use-kerberos=off -U raistlin //<FQDN of VIP>/World -c dir

If I take --use-kerberos=off out, it fails with:

----------------------------------------------------------------------------------------------
gse_get_client_auth_token: gss_init_sec_context failed with [Unspecified GSS failure. Minor code may provide more information: The ticket isn't for us](2529638947)
gensec_spnego_client_negTokenTarg_step: SPNEGO(gse_krb5) login failed: NT_STATUS_LOGON_FAILURE
session setup failed: NT_STATUS_LOGON_FAILURE
----------------------------------------------------------------------------------------------

If I add --debuglevel=10 to the above, I see this (it seems to
authenticate fine using Kerberos but fails later in the process):

----------------------------------------------------------------------------------------------
smb_krb5_trace_cb: [3360310] 1739293354.582941: Initiating TCP connection to stream 10.0.131.248:88
smb_krb5_trace_cb: [3360310] 1739293354.582942: Sending TCP request to stream 10.0.131.248:88
smb_krb5_trace_cb: [3360310] 1739293354.582943: Received answer (1858 bytes) from stream 10.0.131.248:88
smb_krb5_trace_cb: [3360310] 1739293354.582944: Terminating TCP connection to stream 10.0.131.248:88
smb_krb5_trace_cb: [3360310] 1739293354.582945: Response was from master KDC
smb_krb5_trace_cb: [3360310] 1739293354.582946: Processing preauth types: PA-ETYPE-INFO2 (19)
smb_krb5_trace_cb: [3360310] 1739293354.582947: Selected etype info: etype aes256-cts, salt "AD.LASTHOME.SOLACE.KRYNNraistlin", params "\x00\x00\x10\x00"
smb_krb5_trace_cb: [3360310] 1739293354.582948: Produced preauth for next request: (empty)
smb_krb5_trace_cb: [3360310] 1739293354.582949: AS key determined by preauth: aes256-cts/D6F1
smb_krb5_trace_cb: [3360310] 1739293354.582950: Decrypted AS reply; session key is: aes256-cts/75CF
smb_krb5_trace_cb: [3360310] 1739293354.582951: FAST negotiation: available
kerberos_kinit_password_ext: raistlin@AD.LASTHOME.SOLACE.KRYNN mapped to raistlin@AD.LASTHOME.SOLACE.KRYNN
smb_krb5_trace_cb: [3360310] 1739293354.582952: Initializing MEMORY:cliconnect with default princ raistlin@AD.LASTHOME.SOLACE.KRYNN
smb_krb5_trace_cb: [3360310] 1739293354.582953: Storing raistlin@AD.LASTHOME.SOLACE.KRYNN -> krbtgt/AD.LASTHOME.SOLACE.KRYNN@AD.LASTHOME.SOLACE.KRYNN in MEMORY:cliconnect
cli_session_creds_prepare_krb5: Successfully authenticated as raistlin@AD.LASTHOME.SOLACE.KRYNN (raistlin@AD.LASTHOME.SOLACE.KRYNN) to access rivendell.lasthome.solace.krynn using Kerberos
cli_session_setup_spnego_send: Connect to rivendell.lasthome.solace.krynn as raistlin@AD.LASTHOME.SOLACE.KRYNN using SPNEGO
GENSEC backend 'gssapi_spnego' registered
GENSEC backend 'gssapi_krb5' registered
GENSEC backend 'gssapi_krb5_sasl' registered
GENSEC backend 'spnego' registered
GENSEC backend 'schannel' registered
GENSEC backend 'ncalrpc_as_system' registered
GENSEC backend 'sasl-EXTERNAL' registered
GENSEC backend 'ntlmssp' registered
GENSEC backend 'ntlmssp_resume_ccache' registered
GENSEC backend 'http_basic' registered
GENSEC backend 'http_ntlm' registered
GENSEC backend 'http_negotiate' registered
Starting GENSEC mechanism spnego
Starting GENSEC submechanism gse_krb5
gensec_update_send: gse_krb5[0x564217c7eb70]: subreq: 0x564217c46690
gensec_update_send: spnego[0x564217c44a20]: subreq: 0x564217c86c00
gensec_update_done: gse_krb5[0x564217c7eb70]: NT_STATUS_MORE_PROCESSING_REQUIRED
  tevent_req[0x564217c46690/../../source3/librpc/crypto/gse.c:896]: state[2] error[0 (0x0)] state[struct gensec_gse_update_state (0x564217c46870)] timer[(nil)] finish[../../source3/librpc/crypto/gse.c:906]
gensec_update_done: spnego[0x564217c44a20]: NT_STATUS_MORE_PROCESSING_REQUIRED
  tevent_req[0x564217c86c00/../../auth/gensec/spnego.c:1632]: state[2] error[0 (0x0)] state[struct gensec_spnego_update_state (0x564217c86de0)] timer[(nil)] finish[../../auth/gensec/spnego.c:2116]
gse_get_client_auth_token: gss_init_sec_context failed with [Unspecified GSS failure. Minor code may provide more information: The ticket isn't for us](2529638947)
gensec_update_send: gse_krb5[0x564217c7eb70]: subreq: 0x564217c5e030
gensec_update_send: spnego[0x564217c44a20]: subreq: 0x564217c8caf0
gensec_update_done: gse_krb5[0x564217c7eb70]: NT_STATUS_LOGON_FAILURE
  tevent_req[0x564217c5e030/../../source3/librpc/crypto/gse.c:896]: state[3] error[-7963671676338569107 (0x917B5ACDC000006D)] state[struct gensec_gse_update_state (0x564217c5e210)] timer[(nil)] finish[../../source3/librpc/crypto/gse.c:909]
gensec_spnego_client_negTokenTarg_step: SPNEGO(gse_krb5) login failed: NT_STATUS_LOGON_FAILURE
gensec_update_done: spnego[0x564217c44a20]: NT_STATUS_LOGON_FAILURE
  tevent_req[0x564217c8caf0/../../auth/gensec/spnego.c:1632]: state[3] error[-7963671676338569107 (0x917B5ACDC000006D)] state[struct gensec_spnego_update_state (0x564217c8ccd0)] timer[(nil)] finish[../../auth/gensec/spnego.c:2039]
SPNEGO login failed: The attempted logon is invalid. This is either due to a bad username or authentication information.
session setup failed: NT_STATUS_LOGON_FAILURE
----------------------------------------------------------------------------------------------

The reason I'm not using CTDB is that I need to control which server
the VIPs float to, as there are supplemental services that need to
switch to the same server as well.

I am very lost, and none of the literature I've found (or even ChatGPT,
which helped me debug this a little further) seems to help.

Any ideas? I'll take anything at this point...

Vincent
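
One check that might help narrow down the "The ticket isn't for us" failure is to confirm that each fileserver's keytab actually contains the "host/<FQDN of VIP>" and "cifs/<FQDN of VIP>" principals that the adcli step was meant to add. The following is only a minimal sketch: the function name missing_vip_spns and the hostname vip.example.com are hypothetical, and it assumes the usual `klist -k` listing format (a KVNO column followed by principal@REALM). It compares principal names only, not key versions or encryption types, so an empty result does not prove the keys match what the KDC issued.

```shell
#!/bin/sh
# Hypothetical helper (not part of Samba or adcli): print every VIP SPN
# that does NOT appear in a keytab listing supplied on stdin.
#   $1 = FQDN of the VIP
missing_vip_spns() {
    vip_fqdn=$1
    listing=$(cat)    # expected: output of `klist -k`
    for svc in host cifs; do
        spn="$svc/$vip_fqdn"
        # Fixed-string, case-insensitive match on "<spn>@" so that
        # "cifs/vip.example.com.other" is not mistaken for a hit.
        if ! printf '%s\n' "$listing" | grep -qiF -- "$spn@"; then
            printf '%s\n' "$spn"
        fi
    done
}

# Example use on a fileserver (hostname and keytab path are assumptions):
#   klist -k /etc/krb5.keytab | missing_vip_spns vip.example.com
```

If this prints nothing on every server, the principal names are at least present in each keytab; any line it does print names an SPN that is absent on that server.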