A few times in the last couple of months our production Samba 3.0.14a-1 server has crashed. We did not have detailed logging turned on at the time, so the attached log output is the best process tracking I can give for the time of the crash.

What happens is that our shares are no longer accessible and all the smbd processes are hung. We can ssh to the server and restart both smbd and nmbd, and it acts as if it is restarting them. But if we check the status of smbd after the restart, all the old process IDs it had before the crash are still listed and it is still hung, as if it had never been restarted and cleared out. After that point we just reboot the server and re-initialize everything to bring it back up.

We have been using Samba here for a while now, and this is the first time I've ever had something like this happen over and over. We are running Fedora Core 3 with kernel version 2.6.11-1.14.

At this point I'm just looking for some guidance on things to try to get this resolved. Any help would be appreciated.

Thanks,
Matt Lung
MTD Corp.
-------------- next part --------------
Jun 17 14:51:26 retainer smbd[8922]: [2005/06/17 14:51:26, 0] libsmb/clitrans.c:cli_receive_trans(190)
Jun 17 14:51:26 retainer smbd[8922]: Expected SMBtrans response, got command 0x00
Jun 17 14:51:26 retainer smbd[8922]: [2005/06/17 14:51:26, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1468)
Jun 17 14:51:26 retainer smbd[8922]: cli_nt_session_open: pipe hnd state failed. Error was SUCCESS - 0
Jun 17 14:51:27 retainer smbd[8922]: [2005/06/17 14:51:27, 0] libsmb/clitrans.c:cli_receive_trans(190)
Jun 17 14:51:27 retainer smbd[8922]: Expected SMBtrans response, got command 0x00
Jun 17 14:51:27 retainer smbd[8922]: [2005/06/17 14:51:27, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1468)
Jun 17 14:51:27 retainer smbd[8922]: cli_nt_session_open: pipe hnd state failed. Error was SUCCESS - 0
Jun 17 14:51:28 retainer smbd[8922]: [2005/06/17 14:51:28, 0] libsmb/clitrans.c:cli_receive_trans(190)
Jun 17 14:51:28 retainer smbd[8922]: Expected SMBtrans response, got command 0x00
Jun 17 14:51:28 retainer smbd[8922]: [2005/06/17 14:51:28, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1468)
Jun 17 14:51:28 retainer smbd[8922]: cli_nt_session_open: pipe hnd state failed. Error was SUCCESS - 0
Jun 17 14:51:40 retainer smbd[8922]: [2005/06/17 14:51:40, 0] libsmb/clitrans.c:cli_receive_trans(190)
Jun 17 14:51:40 retainer smbd[8922]: Expected SMBtrans response, got command 0x00
Jun 17 14:51:40 retainer smbd[8922]: [2005/06/17 14:51:40, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1468)
Jun 17 14:51:40 retainer smbd[8922]: cli_nt_session_open: pipe hnd state failed. Error was SUCCESS - 0
Jun 17 14:51:40 retainer smbd[8922]: [2005/06/17 14:51:40, 0] libsmb/clitrans.c:cli_receive_trans(190)
Jun 17 14:51:40 retainer smbd[8922]: Expected SMBtrans response, got command 0x00
Jun 17 14:51:40 retainer smbd[8922]: [2005/06/17 14:51:40, 0] rpc_client/cli_pipe.c:cli_nt_session_open(1468)
Jun 17 14:51:40 retainer smbd[8922]: cli_nt_session_open: pipe hnd state failed. Error was SUCCESS - 0
Jun 17 14:52:01 retainer crond(pam_unix)[13019]: session opened for user root by (uid=0)
Jun 17 14:52:01 retainer crond(pam_unix)[13019]: session closed for user root
Jun 17 14:53:01 retainer crond(pam_unix)[13062]: session opened for user root by (uid=0)
Jun 17 14:53:01 retainer crond(pam_unix)[13062]: session closed for user root
Jun 17 14:54:01 retainer crond(pam_unix)[13078]: session opened for user root by (uid=0)
Jun 17 14:54:01 retainer crond(pam_unix)[13078]: session closed for user root
Jun 17 14:54:03 retainer smbd[13082]: [2005/06/17 14:54:03, 0] lib/util_sock.c:get_peer_addr(1150)
Jun 17 14:54:03 retainer smbd[13082]: getpeername failed. Error was Transport endpoint is not connected
Jun 17 14:54:03 retainer smbd[13082]: [2005/06/17 14:54:03, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 14:54:03 retainer smbd[13082]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 14:54:03 retainer smbd[13082]: [2005/06/17 14:54:03, 0] lib/util_sock.c:write_socket(455)
Jun 17 14:54:03 retainer smbd[13082]: write_socket: Error writing 4 bytes to socket 5: ERRNO = Connection reset by peer
Jun 17 14:54:03 retainer smbd[13082]: [2005/06/17 14:54:03, 0] lib/util_sock.c:send_smb(647)
Jun 17 14:54:03 retainer smbd[13082]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 14:55:01 retainer crond(pam_unix)[13103]: session opened for user root by (uid=0)
Jun 17 14:55:01 retainer crond(pam_unix)[13103]: session closed for user root
Jun 17 14:56:01 retainer crond(pam_unix)[13143]: session opened for user root by (uid=0)
Jun 17 14:56:02 retainer crond(pam_unix)[13143]: session closed for user root
Jun 17 14:57:01 retainer crond(pam_unix)[13168]: session opened for user root by (uid=0)
Jun 17 14:57:01 retainer crond(pam_unix)[13168]: session closed for user root
Jun 17 14:57:35 retainer smbd[13183]: [2005/06/17 14:57:35, 0] lib/util_sock.c:get_peer_addr(1150)
Jun 17 14:57:35 retainer smbd[13183]: getpeername failed. Error was Transport endpoint is not connected
Jun 17 14:57:35 retainer smbd[13183]: [2005/06/17 14:57:35, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 14:57:35 retainer smbd[13183]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 14:57:35 retainer smbd[13183]: [2005/06/17 14:57:35, 0] lib/util_sock.c:write_socket(455)
Jun 17 14:57:35 retainer smbd[13183]: write_socket: Error writing 4 bytes to socket 26: ERRNO = Connection reset by peer
Jun 17 14:57:35 retainer smbd[13183]: [2005/06/17 14:57:35, 0] lib/util_sock.c:send_smb(647)
Jun 17 14:57:35 retainer smbd[13183]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 14:57:48 retainer smbd[13187]: [2005/06/17 14:57:48, 0] lib/util_sock.c:read_socket_data(384)
Jun 17 14:57:48 retainer smbd[13187]: read_socket_data: recv failure for 4. Error = Connection reset by peer
Jun 17 14:58:01 retainer crond(pam_unix)[13192]: session opened for user root by (uid=0)
Jun 17 14:58:01 retainer crond(pam_unix)[13192]: session closed for user root
Jun 17 14:59:01 retainer crond(pam_unix)[13208]: session opened for user root by (uid=0)
Jun 17 14:59:01 retainer crond(pam_unix)[13208]: session closed for user root
Jun 17 15:00:01 retainer crond(pam_unix)[13223]: session opened for user root by (uid=0)
Jun 17 15:00:01 retainer crond(pam_unix)[13224]: session opened for user root by (uid=0)
Jun 17 15:00:01 retainer crond(pam_unix)[13228]: session opened for user root by (uid=0)
Jun 17 15:00:01 retainer crond(pam_unix)[13224]: session closed for user root
Jun 17 15:00:01 retainer crond(pam_unix)[13223]: session closed for user root
Jun 17 15:00:55 retainer smbd[13252]: [2005/06/17 15:00:55, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 15:00:55 retainer smbd[13252]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 15:00:55 retainer smbd[13252]: [2005/06/17 15:00:55, 0] lib/util_sock.c:write_socket(455)
Jun 17 15:00:55 retainer smbd[13252]: write_socket: Error writing 4 bytes to socket 5: ERRNO = Connection reset by peer
Jun 17 15:00:55 retainer smbd[13252]: [2005/06/17 15:00:55, 0] lib/util_sock.c:send_smb(647)
Jun 17 15:00:55 retainer smbd[13252]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 15:01:01 retainer crond(pam_unix)[13256]: session opened for user root by (uid=0)
Jun 17 15:01:01 retainer crond(pam_unix)[13258]: session opened for user root by (uid=0)
Jun 17 15:01:01 retainer crond(pam_unix)[13256]: session closed for user root
Jun 17 15:01:01 retainer crond(pam_unix)[13258]: session closed for user root
Jun 17 15:01:21 retainer kernel: lease broken - owner pid = 8349
Jun 17 15:02:01 retainer crond(pam_unix)[13275]: session opened for user root by (uid=0)
Jun 17 15:02:01 retainer crond(pam_unix)[13275]: session closed for user root
Jun 17 15:02:03 retainer smbd[13278]: [2005/06/17 15:02:03, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 15:02:03 retainer smbd[13278]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 15:02:03 retainer smbd[13278]: [2005/06/17 15:02:03, 0] lib/util_sock.c:write_socket(455)
Jun 17 15:02:03 retainer smbd[13278]: write_socket: Error writing 4 bytes to socket 26: ERRNO = Connection reset by peer
Jun 17 15:02:03 retainer smbd[13278]: [2005/06/17 15:02:03, 0] lib/util_sock.c:send_smb(647)
Jun 17 15:02:03 retainer smbd[13278]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 15:02:43 retainer crond(pam_unix)[13228]: session closed for user root
Jun 17 15:03:01 retainer crond(pam_unix)[13290]: session opened for user root by (uid=0)
Jun 17 15:03:01 retainer crond(pam_unix)[13290]: session closed for user root
Jun 17 15:04:01 retainer crond(pam_unix)[13306]: session opened for user root by (uid=0)
Jun 17 15:04:01 retainer crond(pam_unix)[13306]: session closed for user root
Jun 17 15:05:01 retainer crond(pam_unix)[13319]: session opened for user root by (uid=0)
Jun 17 15:05:01 retainer crond(pam_unix)[13319]: session closed for user root
Jun 17 15:06:01 retainer crond(pam_unix)[13332]: session opened for user root by (uid=0)
Jun 17 15:06:01 retainer crond(pam_unix)[13332]: session closed for user root
Jun 17 15:06:49 retainer smbd[13344]: [2005/06/17 15:06:49, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 15:06:49 retainer smbd[13344]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 15:06:49 retainer smbd[13344]: [2005/06/17 15:06:49, 0] lib/util_sock.c:write_socket(455)
Jun 17 15:06:49 retainer smbd[13344]: write_socket: Error writing 4 bytes to socket 5: ERRNO = Connection reset by peer
Jun 17 15:06:49 retainer smbd[13344]: [2005/06/17 15:06:49, 0] lib/util_sock.c:send_smb(647)
Jun 17 15:06:49 retainer smbd[13344]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 15:07:01 retainer crond(pam_unix)[13348]: session opened for user root by (uid=0)
Jun 17 15:07:01 retainer crond(pam_unix)[13348]: session closed for user root
Jun 17 15:08:01 retainer crond(pam_unix)[13362]: session opened for user root by (uid=0)
Jun 17 15:08:01 retainer crond(pam_unix)[13362]: session closed for user root
Jun 17 15:09:01 retainer crond(pam_unix)[13375]: session opened for user root by (uid=0)
Jun 17 15:09:01 retainer crond(pam_unix)[13375]: session closed for user root
Jun 17 15:09:24 retainer login(pam_unix)[3848]: session opened for user root by (uid=0)
Jun 17 15:09:24 retainer -- root[3848]: ROOT LOGIN ON tty1
Jun 17 15:10:01 retainer crond(pam_unix)[13437]: session opened for user root by (uid=0)
Jun 17 15:10:01 retainer crond(pam_unix)[13439]: session opened for user root by (uid=0)
Jun 17 15:10:02 retainer crond(pam_unix)[13439]: session closed for user root
Jun 17 15:10:02 retainer crond(pam_unix)[13437]: session closed for user root
Jun 17 15:11:01 retainer crond(pam_unix)[13453]: session opened for user root by (uid=0)
Jun 17 15:11:01 retainer crond(pam_unix)[13453]: session closed for user root
Jun 17 15:12:01 retainer crond(pam_unix)[13468]: session opened for user root by (uid=0)
Jun 17 15:12:01 retainer crond(pam_unix)[13468]: session closed for user root
Jun 17 15:13:01 retainer crond(pam_unix)[13482]: session opened for user root by (uid=0)
Jun 17 15:13:01 retainer crond(pam_unix)[13482]: session closed for user root
Jun 17 15:14:01 retainer crond(pam_unix)[13494]: session opened for user root by (uid=0)
Jun 17 15:14:01 retainer crond(pam_unix)[13494]: session closed for user root
Jun 17 15:15:01 retainer crond(pam_unix)[13509]: session opened for user root by (uid=0)
Jun 17 15:15:01 retainer crond(pam_unix)[13510]: session opened for user postgres by (uid=0)
Jun 17 15:15:01 retainer crond(pam_unix)[13509]: session closed for user root
Jun 17 15:15:36 retainer login(pam_unix)[3849]: session opened for user root by (uid=0)
Jun 17 15:15:36 retainer -- root[3849]: ROOT LOGIN ON tty2
Jun 17 15:15:57 retainer smb: smbd -TERM succeeded
Jun 17 15:15:57 retainer nmbd[3645]: [2005/06/17 15:15:57, 0] nmbd/nmbd.c:terminate(56)
Jun 17 15:15:57 retainer smb: nmbd -TERM succeeded
Jun 17 15:15:57 retainer nmbd[3645]: Got SIGTERM: going down...
Jun 17 15:15:58 retainer smbd[13632]: [2005/06/17 15:15:58, 0] printing/print_cups.c:cups_cache_reload(85)
Jun 17 15:15:58 retainer smbd[13632]: Unable to connect to CUPS server localhost - Connection refused
Jun 17 15:15:58 retainer smbd[13632]: [2005/06/17 15:15:58, 0] printing/print_cups.c:cups_cache_reload(85)
Jun 17 15:15:58 retainer smbd[13632]: Unable to connect to CUPS server localhost - Connection refused
Jun 17 15:15:58 retainer smb: smbd startup succeeded
Jun 17 15:15:58 retainer smb: nmbd startup succeeded
Jun 17 15:15:58 retainer smbd[13642]: [2005/06/17 15:15:58, 0] lib/util_sock.c:get_peer_addr(1150)
Jun 17 15:15:58 retainer smbd[13642]: getpeername failed. Error was Transport endpoint is not connected
Jun 17 15:15:58 retainer smbd[13642]: [2005/06/17 15:15:58, 0] lib/util_sock.c:write_socket_data(430)
Jun 17 15:15:58 retainer smbd[13642]: write_socket_data: write failure. Error = Connection reset by peer
Jun 17 15:15:58 retainer smbd[13642]: [2005/06/17 15:15:58, 0] lib/util_sock.c:write_socket(455)
Jun 17 15:15:58 retainer smbd[13642]: write_socket: Error writing 4 bytes to socket 26: ERRNO = Connection reset by peer
Jun 17 15:15:58 retainer smbd[13642]: [2005/06/17 15:15:58, 0] lib/util_sock.c:send_smb(647)
Jun 17 15:15:58 retainer smbd[13642]: Error writing 4 bytes to client. -1. (Connection reset by peer)
Jun 17 15:16:01 retainer crond(pam_unix)[13650]: session opened for user root by (uid=0)
Jun 17 15:16:01 retainer crond(pam_unix)[13650]: session closed for user root
Jun 17 15:16:12 retainer crond(pam_unix)[13510]: session closed for user postgres
Jun 17 15:17:01 retainer crond(pam_unix)[13704]: session opened for user root by (uid=0)
Jun 17 15:17:01 retainer crond(pam_unix)[13704]: session closed for user root
Jun 17 15:17:05 retainer smbd[13708]: [2005/06/17 15:17:05, 0] lib/util_sock.c:read_socket_data(384)
Jun 17 15:17:05 retainer smbd[13708]: read_socket_data: recv failure for 4. Error = Connection reset by peer
Jun 17 15:18:01 retainer crond(pam_unix)[13723]: session opened for user root by (uid=0)
Jun 17 15:18:01 retainer crond(pam_unix)[13723]: session closed for user root
Jun 17 15:18:08 retainer shutdown: shutting down for system reboot
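
In case it helps anyone reading the excerpt above, here is a rough sketch of how the Samba debug headers in a log like this could be tallied by source file and function, to see which code paths complain most often. The names `HEADER`, `tally_smbd_headers` and `sample` are made up for this illustration and are not part of Samba; the pattern assumes the header layout seen in the excerpt (`[date time, level] file:function(line)`), and `sample` holds three headers copied from it.

```python
import re
from collections import Counter

# Matches a Samba debug header such as
# "[2005/06/17 14:51:26, 0] libsmb/clitrans.c:cli_receive_trans(190)".
# Captures the source file and the function name; the line number is ignored.
HEADER = re.compile(
    r"\[\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}, \d+\] ([\w/.]+):(\w+)\(\d+\)"
)


def tally_smbd_headers(log_text):
    """Count debug headers by (source file, function name)."""
    return Counter(m.groups() for m in HEADER.finditer(log_text))


# Three headers copied from the excerpt, run together on one line
# to show that the search does not depend on line breaks.
sample = (
    "Jun 17 14:51:26 retainer smbd[8922]: [2005/06/17 14:51:26, 0] "
    "libsmb/clitrans.c:cli_receive_trans(190) "
    "Jun 17 14:54:03 retainer smbd[13082]: [2005/06/17 14:54:03, 0] "
    "lib/util_sock.c:get_peer_addr(1150) "
    "Jun 17 14:57:35 retainer smbd[13183]: [2005/06/17 14:57:35, 0] "
    "lib/util_sock.c:get_peer_addr(1150)"
)

# Print the most frequent header locations first.
for (source, func), count in tally_smbd_headers(sample).most_common():
    print(f"{count:3d}  {source}:{func}")
```

Because the pattern is searched anywhere in the text, the same function should work on a whole syslog file read into one string. It only counts headers and does not interpret the messages that follow them.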