Emanuel M. Di Vita
2021-Jan-26 09:46 UTC
[Samba] Fwd: Share stops working during a data migration
Hello people,
I have a problem with Samba v4.9.5 during a data migration.
I'm migrating data from a Windows Server 2016 client to a Samba share (Samba is joined to an AD), and the software I'm using to migrate the data connects to the share using the node hostname (example: //sambanode/mysharename) instead of the IP or FQDN. The migration software uses an AD user to write data to the share.

When the migration task starts, everything looks good and behaves well, but after some time it starts to fail copying objects.
In the Samba log for that client I see the following messages when the issue occurs:

[2021/01/25 13:19:31.043492,  5] ../auth/gensec/gensec_start.c:739(gensec_start_mech)
  Starting GENSEC submechanism ntlmssp
[2021/01/25 13:19:31.043514,  2] ../auth/ntlmssp/ntlmssp.c:119(gensec_ntlmssp_update_find)
  Failed to parse NTLMSSP packet: zero length
[2021/01/25 13:19:31.043532,  4] ../source3/smbd/sec_ctx.c:438(pop_sec_ctx)
  pop_sec_ctx (0, 0) - sec_ctx_stack_ndx = 0
[2021/01/25 13:19:31.043553,  5] ../auth/gensec/gensec.c:492(gensec_update_done)
  gensec_update_done: ntlmssp[816c460]: NT_STATUS_INVALID_PARAMETER
[2021/01/25 13:19:31.043570,  1] ../auth/gensec/spnego.c:1218(gensec_spnego_server_negTokenInit_step)
  gensec_spnego_server_negTokenInit_step: ntlmssp: parsing NEG_TOKEN_INIT content failed (next[(null)]): NT_STATUS_INVALID_PARAMETER
[2021/01/25 13:19:31.043588,  5] ../auth/gensec/gensec.c:492(gensec_update_done)
  gensec_update_done: spnego[82f32f8]: NT_STATUS_INVALID_PARAMETER
[2021/01/25 13:19:31.043605,  4] ../source3/smbd/sec_ctx.c:216(push_sec_ctx)
  push_sec_ctx(0, 0) : sec_ctx_stack_ndx = 1
[2021/01/25 13:19:31.043618,  4] ../source3/smbd/uid.c:558(push_conn_ctx)
  push_conn_ctx(0) : conn_ctx_stack_ndx = 0
[2021/01/25 13:19:31.043630,  4] ../source3/smbd/sec_ctx.c:320(set_sec_ctx_internal)
  setting sec ctx (0, 0) - sec_ctx_stack_ndx = 1
[2021/01/25 13:19:31.043642,  5] ../libcli/security/security_token.c:53(security_token_debug)
  Security token: (NULL)
[2021/01/25 13:19:31.043654,  5] ../source3/auth/token_util.c:866(debug_unix_user_token)
  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2021/01/25 13:19:31.043680,  4] ../source3/smbd/sec_ctx.c:438(pop_sec_ctx)
  pop_sec_ctx (0, 0) - sec_ctx_stack_ndx = 0
[2021/01/25 13:19:31.043702,  3] ../source3/smbd/smb2_server.c:3195(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_INVALID_PARAMETER] || at ../source3/smbd/smb2_sesssetup.c:137

These messages are repeated over and over.
The effect is that I can no longer access the share using the hostname, but if I access it using the node IP address or the FQDN (example: sambanode.my-domain.local) it works. Side note: the Windows client can reach the Samba share again after roughly two or three hours. A Windows reboot does not solve it, and it is not a caching issue, because the Windows node doesn't have the share mapped as a drive.
Another strange thing is that in the meantime I can access the shares from other clients using the hostname.

This is the global section of the smb.conf I'm using:

[global]
    client ldap sasl wrapping = plain
    dedicated keytab file = /etc/krb5.keytab
    disable spoolss = yes
    host msdfs = no
    idmap config * : backend = tdb
    idmap config * : range = 30000-40000
    idmap config * : schema_mode = rfc2307
    idmap config my-domain : backend = rid
    idmap config my-domain : range = 1000000-20000000
    idmap config my-domain : schema_mode = rfc2307
    kerberos method = secrets and keytab
    load printers = no
    local master = no
    log file = /opt/samba/log/%m.log
    log level = 5
    map acl inherit = Yes
    map to guest = bad user
    max log size = 100000
    preferred master = no
    printcap name = /dev/null
    realm = my-domain.local
    security = ads
    server string = Data %h
    store dos attributes = Yes
    vfs objects = zfsacl
    winbind enum groups = yes
    winbind enum users = yes
    winbind expand groups = 4
    winbind nested groups = yes
    winbind normalize names = no
    winbind nss info = rfc2307
    winbind refresh tickets = Yes
    winbind use default domain = no
    workgroup = MY-DOMAIN

I've found this bug: https://bugzilla.samba.org/show_bug.cgi?id=14106, which looks very similar to the issue I'm experiencing, but it's not clear to me whether it really matches my case.
A bug was also raised with Red Hat for the same issue: https://bugzilla.redhat.com/show_bug.cgi?id=1657428, but I don't know whether the two are linked or related in some way.

Do you have any ideas or suggestions?
If bug 14106 is the one I'm experiencing, could you please be so kind as to tell me which version fixes the problem (4.10 probably?)
Is there any workaround I could put in place to avoid this issue without upgrading Samba?

Thanks in advance and best regards,
Emanuel
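
To make it easier to see how often the failure in the log excerpt above repeats, the entries can be grouped and their NT_STATUS codes counted. The following is only a minimal Python sketch, assuming the usual Samba debug layout of a bracketed header line (timestamp, debug level, source location) followed by indented message lines; the names `parse_entries` and `count_statuses` are hypothetical helpers, not part of Samba.

```python
import re
from collections import Counter

# Three entries copied from the log excerpt in the message above.
SAMPLE_LOG = """\
[2021/01/25 13:19:31.043514,  2] ../auth/ntlmssp/ntlmssp.c:119(gensec_ntlmssp_update_find)
  Failed to parse NTLMSSP packet: zero length
[2021/01/25 13:19:31.043553,  5] ../auth/gensec/gensec.c:492(gensec_update_done)
  gensec_update_done: ntlmssp[816c460]: NT_STATUS_INVALID_PARAMETER
[2021/01/25 13:19:31.043702,  3] ../source3/smbd/smb2_server.c:3195(smbd_smb2_request_error_ex)
  smbd_smb2_request_error_ex: smbd_smb2_request_error_ex: idx[1] status[NT_STATUS_INVALID_PARAMETER] || at ../source3/smbd/smb2_sesssetup.c:137
"""

# Header line: "[date time,  level] file:line(function)".
HEADER = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*(?P<level>\d+)\]\s+"
    r"(?P<file>\S+):(?P<line>\d+)\((?P<func>\w+)\)\s*$"
)
STATUS = re.compile(r"NT_STATUS_[A-Z_]+")


def parse_entries(text):
    """Group header lines with the message lines that follow them."""
    entries = []
    for raw in text.splitlines():
        match = HEADER.match(raw)
        if match:
            entries.append({
                "ts": match.group("ts"),
                "level": int(match.group("level")),
                "func": match.group("func"),
                "message": "",
            })
        elif entries and raw.strip():
            # Continuation line: append to the message of the current entry.
            current = entries[-1]
            current["message"] = (current["message"] + " " + raw.strip()).strip()
    return entries


def count_statuses(entries):
    """Count every NT_STATUS_* code mentioned in the entry messages."""
    counts = Counter()
    for entry in entries:
        counts.update(STATUS.findall(entry["message"]))
    return counts


if __name__ == "__main__":
    for status, n in count_statuses(parse_entries(SAMPLE_LOG)).items():
        print(status, n)
```

Run against the full per-client log file (here /opt/samba/log/%m.log, per the `log file` setting), the same two functions would show whether NT_STATUS_INVALID_PARAMETER is the only status being returned while the hostname access is failing.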
Rowland Penny
2021-Jan-26 10:18 UTC
[Samba] Fwd: Share stops working during a data migration
On 26/01/2021 09:46, Emanuel M. Di Vita via samba wrote:
> Hello people,
> I've a problem with samba v4.9.5 during a data migration.
> [...]
> This is the global section in the smb.conf I'm using:
> [...]
> Is there any workaround I could put in place to avoid this issue
> without the upgrade samba?

The problem here is threefold. The first part is very minor: you have lines in your smb.conf that are not required, but this probably has nothing to do with your problem.

Your main problem is the use of ZFS and Samba 4.9.5. There have been numerous code changes since 4.9.5 was released, some of them around the use of ZFS with Samba. I would suggest you upgrade to the highest supported Samba version that you can (4.11.x, 4.12.x, 4.13.x). If there is still a bug, you then stand a chance of getting it fixed; you have little or no chance of getting a bug in 4.9.5 fixed, because it is EOL.

Rowland
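
The reply does not say which smb.conf lines it considers unnecessary. One possible candidate, offered here only as an assumption, is `schema_mode`, which the Samba manual pages document as an option of the `ad` idmap backend, while the posted config sets it for the `tdb` and `rid` backends. A minimal Python sketch of such a check follows; `idmap_options` and `schema_mode_without_ad` are hypothetical helper names, and `testparm` remains the authoritative way to validate a config.

```python
import re

# idmap lines copied from the [global] section posted in the thread.
SAMPLE_CONF = """\
[global]
    idmap config * : backend = tdb
    idmap config * : range = 30000-40000
    idmap config * : schema_mode = rfc2307
    idmap config my-domain : backend = rid
    idmap config my-domain : range = 1000000-20000000
    idmap config my-domain : schema_mode = rfc2307
    security = ads
"""

# Parameter name of the form "idmap config DOMAIN : OPTION".
IDMAP_KEY = re.compile(
    r"^idmap config\s+(?P<domain>.+?)\s*:\s*(?P<option>.+)$", re.IGNORECASE
)


def idmap_options(conf_text):
    """Return {domain: {option: value}} for idmap lines in [global]."""
    domains = {}
    in_global = False
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            in_global = line[1:-1].strip().lower() == "global"
            continue
        if not in_global or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        match = IDMAP_KEY.match(key)
        if match:
            option = " ".join(match.group("option").lower().split())
            domains.setdefault(match.group("domain"), {})[option] = value
    return domains


def schema_mode_without_ad(domains):
    """List domains that set schema_mode while using a backend other than ad."""
    return sorted(
        name
        for name, opts in domains.items()
        if "schema_mode" in opts and opts.get("backend", "").lower() != "ad"
    )


if __name__ == "__main__":
    print(schema_mode_without_ad(idmap_options(SAMPLE_CONF)))
```

For the posted config this flags both the `*` and `my-domain` idmap domains. As the reply itself notes, cleaning up such lines is unlikely to be related to the session setup failures.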