Izan Díez Sánchez
2017-Feb-15 16:23 UTC
[Samba] Randomly losing network share file communication
Hi,

Some users are experiencing problems working with files on Windows and Samba shares from within engineering applications. The sequence is as follows: a user opens a file, e.g. a drawing, inside an application. The user works fine for a while, but suddenly they cannot edit the file anymore. The only way to continue working is to close and reopen the file, as if the session had expired and a new one needed to be opened.

This issue never occurs when working with local files and, at the same time, network shares work fine in most other cases (as they always have).

We have been tracking the error for a while with an increased log level, and we have noticed a pattern in the logs of the communication between the clients and the AD domain controllers that relates to the error the user gets:

> File error reported at 09:54

  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2017/02/15 09:45:45.486764, 5] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2017/02/15 09:45:45.486989, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
[2017/02/15 10:00:30.286739, 5] ../lib/dbwrap/dbwrap.c:178(dbwrap_check_lock_order)
  check lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 10:00:30.286770, 5] ../lib/dbwrap/dbwrap.c:146(dbwrap_lock_order_state_destructor)
  release lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 10:00:30.286828, 4] ../source3/smbd/sec_ctx.c:321(set_sec_ctx_internal)

> File error reported at 10:43

  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2017/02/15 10:30:45.540081, 5] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2017/02/15 10:30:45.540261, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
[2017/02/15 10:45:30.617833, 5] ../lib/dbwrap/dbwrap.c:178(dbwrap_check_lock_order)
  check lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 10:45:30.617891, 5] ../lib/dbwrap/dbwrap.c:146(dbwrap_lock_order_state_destructor)
  release lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 10:45:30.617991, 4] ../source3/smbd/sec_ctx.c:321(set_sec_ctx_internal)

> File error reported at 11:11

  UNIX token of user 0
  Primary group is 0 and contains 0 supplementary groups
[2017/02/15 11:00:45.584903, 5] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2017/02/15 11:00:45.585085, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
[2017/02/15 11:15:30.887581, 5] ../lib/dbwrap/dbwrap.c:178(dbwrap_check_lock_order)
  check lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 11:15:30.887611, 5] ../lib/dbwrap/dbwrap.c:146(dbwrap_lock_order_state_destructor)
  release lock order 2 for /usr/local/samba/var/lock/serverid.tdb
[2017/02/15 11:15:30.887671, 4] ../source3/smbd/sec_ctx.c:321(set_sec_ctx_internal)

As you can see, there is some kind of connection termination (NT_STATUS_CONNECTION_RESET) before each error appears. Our guess is that the client application either does not notice that the connection was reset or cannot keep track of the open file once the connection is re-established.

Active Directory implemented with samba4:
- CentOS 6.7
- Samba Version 4.4.3
- BIND_DLZ 9.9.8

It has been working so far with very few issues. There is no trust relationship in play to complicate things.

We hope there may be some tuning parameter that can handle these special cases with delicate applications.

Regards,
Izan Díez Sánchez
ids at empre.es
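The pattern described in this message (a "Server exit (NT_STATUS_CONNECTION_RESET)" entry some minutes before each reported file error) can be extracted from a log mechanically. The following is a minimal Python sketch, not a Samba tool: it assumes the two-line entry layout visible in the excerpts (a bracketed timestamp/level header followed by an indented message), and the function name `find_server_exits` and the embedded sample are illustrative only.

```python
import re
from datetime import datetime

# Header of a log entry as it appears in the excerpts, e.g.
# [2017/02/15 09:45:45.486989, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+),\s*\d+\]")


def find_server_exits(log_text, status="NT_STATUS_CONNECTION_RESET"):
    """Return the timestamps of 'Server exit (<status>)' entries in log_text."""
    exits = []
    last_ts = None  # timestamp of the most recent header line seen
    for line in log_text.splitlines():
        m = HEADER.match(line)
        if m:
            last_ts = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S.%f")
        elif last_ts is not None and f"Server exit ({status})" in line:
            exits.append(last_ts)
    return exits


# Sample lines copied from the log excerpts in the post.
SAMPLE = """\
[2017/02/15 09:45:45.486764, 5] ../source3/smbd/uid.c:425(smbd_change_to_root_user)
  change_to_root_user: now uid=(0,0) gid=(0,0)
[2017/02/15 09:45:45.486989, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
[2017/02/15 10:30:45.540261, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
[2017/02/15 11:00:45.585085, 3] ../source3/smbd/server_exit.c:246(exit_server_common)
  Server exit (NT_STATUS_CONNECTION_RESET)
"""

exits = find_server_exits(SAMPLE)
for ts in exits:
    print(ts.strftime("%H:%M:%S"))  # 09:45:45, 10:30:45, 11:00:45

# Whole minutes between consecutive resets.
gaps = [int((b - a).total_seconds()) // 60 for a, b in zip(exits, exits[1:])]
print(gaps)  # [45, 30]
```

Listing the reset times next to the times users reported errors makes it easier to check whether every file error is preceded by a reset, which is the correlation the post suggests.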
Jeremy Allison
2017-Feb-16 00:41 UTC
[Samba] Randomly losing network share file communication
On Wed, Feb 15, 2017 at 05:23:32PM +0100, Izan Díez Sánchez via samba wrote:
> Hi,
>
> Some users are experiencing problems working with files in Windows and Samba
> shares within engineering applications. The sequence is as follows:
> A user opens a file, e.g. a drawing, inside an application. The user works
> fine for a while, but suddenly it cannot edit the file anymore. The only way
> to continue working is closing and opening the file again, like if the
> session had expired and a new one needed to be opened.

> [2017/02/15 09:45:45.486989, 3]
> ../source3/smbd/server_exit.c:246(exit_server_common)
> Server exit (NT_STATUS_CONNECTION_RESET)

Get a Wireshark trace. This usually means a TCP RST packet was received.
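In a capture, the segments of interest are those with the RST bit set in the TCP header (Wireshark's display filter `tcp.flags.reset == 1` selects them). As a minimal Python sketch of what that bit looks like on the wire, the code below reads the flags byte of a raw TCP header. The helper names and the client port 50000 are hypothetical; `build_header` exists only to produce illustrative input and is not a capture tool.

```python
import struct

TCP_FLAG_RST = 0x04  # RST bit within the TCP flags byte
TCP_FLAG_ACK = 0x10  # ACK bit, used here only for the examples


def tcp_flags(tcp_header: bytes) -> int:
    """Return the flags byte (offset 13) of a raw TCP header."""
    if len(tcp_header) < 14:
        raise ValueError("TCP header too short")
    return tcp_header[13]


def is_rst(tcp_header: bytes) -> bool:
    """True if the RST bit is set in the given TCP header."""
    return bool(tcp_flags(tcp_header) & TCP_FLAG_RST)


def build_header(src_port: int, dst_port: int, flags: int) -> bytes:
    """Build a minimal 20-byte TCP header for illustration (checksum left at 0)."""
    data_offset = 5 << 4  # header length of 5 32-bit words, no options
    return struct.pack(
        "!HHIIBBHHH",
        src_port, dst_port,
        0, 0,            # sequence and acknowledgement numbers
        data_offset, flags,
        0, 0, 0,         # window, checksum, urgent pointer
    )


# Hypothetical client port 50000 talking to the SMB port 445.
reset_segment = build_header(50000, 445, TCP_FLAG_RST | TCP_FLAG_ACK)
normal_segment = build_header(50000, 445, TCP_FLAG_ACK)

print(is_rst(reset_segment))   # True
print(is_rst(normal_segment))  # False
```

Which side sent the RST (client, server, or a device in between) is the useful information in the trace, since the smbd log only shows that the connection was reset.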
Ivan Sergio Borgonovo
2017-Feb-16 15:34 UTC
[Samba] Randomly losing network share file communication
On 02/16/2017 01:41 AM, Jeremy Allison via samba wrote:
> On Wed, Feb 15, 2017 at 05:23:32PM +0100, Izan Díez Sánchez via samba wrote:
>> Hi,
>>
>> Some users are experiencing problems working with files in Windows and Samba
>> shares within engineering applications. The sequence is as follows:
>> A user opens a file, e.g. a drawing, inside an application. The user works
>> fine for a while, but suddenly it cannot edit the file anymore. The only way
>> to continue working is closing and opening the file again, like if the
>> session had expired and a new one needed to be opened.

>> [2017/02/15 09:45:45.486989, 3]
>> ../source3/smbd/server_exit.c:246(exit_server_common)
>> Server exit (NT_STATUS_CONNECTION_RESET)

> Get a wireshark trace. This usually means a TCP RST packet
> was receieved.

I have a similar problem. It started when I replaced a Linux server and began running a newer kernel.

As reported in a previous post ("timeout after inactivity on mount.cifs", 13/02/2017), I was mounting a share from a Windows Server 2008 machine with mount.cifs and then running tar on its directories. The workaround was to mount the Win2008 share *just* before starting tar. Since I began delaying the mount until just before use, I have yet to experience the problem again.

I can't blame the new server hardware. The connection with the Windows 2008 server seems solid. The script, which simply does mount + tar, is the same. tar is not the culprit: after mounting the share and leaving it unused for 30+ minutes, I cannot even ls into it.

What seems to be left is the new software running on the Linux server. I don't know how many parts are involved in mount.cifs, but as far as I understand the most probable culprit is either a regression in the kernel or some small print I didn't read, in which case this is expected behaviour that I simply hadn't experienced before.

--
Ivan Sergio Borgonovo
http://www.webthatworks.it
http://www.borgonovo.net
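The idle test described here (mount the share, leave it unused for 30+ minutes, then try to list it) can be scripted so that it is repeatable across kernels. This is a minimal Python sketch under stated assumptions: the function name `probe_after_idle` is hypothetical, the share is assumed to be already mounted at the path passed in, and a stale mount may hang the listing rather than return an error, in which case the call will simply block.

```python
import os
import time


def probe_after_idle(path, idle_seconds, sleep=time.sleep):
    """List path, stay idle for idle_seconds, then list it again.

    Returns a list of (label, succeeded, error_message) tuples, one per listing.
    The sleep parameter can be replaced to shorten the wait while experimenting.
    """
    results = []
    for label, wait in (("before idle", 0), ("after idle", idle_seconds)):
        if wait:
            sleep(wait)  # leave the mount untouched for the idle period
        try:
            os.listdir(path)
            results.append((label, True, None))
        except OSError as exc:
            results.append((label, False, str(exc)))
    return results
```

For example, `probe_after_idle("/mnt/win2008", 30 * 60)` would reproduce the 30-minute case, where `/mnt/win2008` is a hypothetical mount point. Running the same probe with shorter idle periods could help narrow down after how long the mount stops responding.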