Zdenek Pizl
2005-May-15 10:50 UTC
[Samba] Samba (RHEL 3) hangs after few days or even hours
Hello all,

For the past week I have been trying to solve a strange problem. Suddenly, without any change to the system configuration, Samba stopped working properly.

Symptoms:
- it hangs after a few days or, in the worse case, a few hours; by then there are many smbd processes (from hundreds to thousands)
- no new connections can be established
- it happens regardless of the Samba package version; I have tried all available versions from 3.0.6-2.3E to 3.0.14a-1

System info:
- RHEL 3 Update 4 plus available updates
- the computer is a domain member; the domain is controlled by another Samba server (2.2.8a) with an LDAP backend
- the problematic computer is a file server under medium/high load (about 50 clients read and write gigabytes of data to it)

/etc/samba/smb.conf

# Global parameters
[global]
        workgroup = SYSTINET
        server string = Buildsrv
        security = DOMAIN
        password server = 10.0.0.3
        encrypted passwords = yes
        log level = 10
        log file = /var/log/samba/%m.log
        max log size = 5000
        wins server = 10.0.0.3
        smb ports = 139 445
        name resolve order = hosts wins bcast
        max smbd processes = 5000
        socket options = IPTOS_LOWDELAY TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192 SO_KEEPALIVE
        load printers = No
        dns proxy = No
        ldap ssl = no
        read raw = no
        level2 oplocks = true
        domain master = no
        domain logons = no
        local master = no
        dont descend = /proc,/dev,/lost+found,/BUILD/lost+found

[buildsrv]
        comment = Share to store data etc.
        path = /BUILD
        force user = builder
        force group = users
        read only = No
        create mask = 0644
        directory mask = 0775
        force directory mode = 0775
        guest ok = Yes
        use sendfile = Yes
        write cache size = 65536

I noticed weird messages in nmbd.log:

[2005/05/15 02:50:27, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
  find_workgroup_on_subnet: workgroup search for SYSTINET on subnet UNICAST_SUBNET: found.
[2005/05/15 02:50:27, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
  find_workgroup_on_subnet: workgroup search for SYSTINET on subnet UNICAST_SUBNET: found.
[2005/05/15 02:50:27, 10] lib/util_sock.c:read_udp_socket(230)
  read_udp_socket: lastip 10.0.0.209 lastport 138 read: 215
[2005/05/15 02:50:27, 5] libsmb/nmblib.c:read_packet(757)
  Received a packet of len 215 from (10.0.0.209) port 138
[2005/05/15 02:50:27, 10] nmbd/nmbd_subnetdb.c:namelist_entry_compare(69)
  nmbd_subnetdb:namelist_entry_compare() -1 == memcmp( "SYSTINET<1d>", "SYSTINET<1e>", 84 )
[2005/05/15 02:50:27, 10] nmbd/nmbd_subnetdb.c:namelist_entry_compare(69)
  nmbd_subnetdb:namelist_entry_compare() 1 == memcmp( "SYSTINET<1d>", "SYSTINET<00>", 84 )
[2005/05/15 02:50:27, 9] nmbd/nmbd_namelistdb.c:find_name_on_subnet(129)
  find_name_on_subnet: on subnet 10.0.0.56 - name SYSTINET<1d> NOT FOUND
[2005/05/15 02:50:27, 5] nmbd/nmbd_packets.c:process_dgram(1194)
  process_dgram: ignoring dgram packet sent to name SYSTINET<1d> from 10.0.0.209
[2005/05/15 09:04:26, 4] nmbd/nmbd_workgroupdb.c:dump_workgroups(271)
  dump_workgroups()
   dump workgroup on subnet 10.0.0.56: netmask= 255.255.254.0:
        SYSTINET(1) current master browser = AUTHCZ
                BUILDSRV 40009b03 (Buildsrv)
                AUTHCZ 400c9b2b (AuthCZ PDC)
[2005/05/15 09:04:26, 4] nmbd/nmbd_workgroupdb.c:dump_workgroups(271)
  dump_workgroups()
   dump workgroup on subnet UNICAST_SUBNET: netmask= 0.0.0.0:
        SYSTINET(1) current master browser = UNKNOWN
                BUILDSRV 40009b03 (Buildsrv)
[2005/05/15 09:04:26, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
  find_workgroup_on_subnet: workgroup search for SYSTINET on subnet UNICAST_SUBNET: found.
[2005/05/15 09:04:26, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
  find_workgroup_on_subnet: workgroup search for SYSTINET on subnet UNICAST_SUBNET: found.
[2005/05/15 09:04:36, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
  find_workgroup_on_subnet: workgroup search for SYSTINET on subnet 10.0.0.56: found.
[2005/05/15 09:04:36, 10] nmbd/nmbd_sendannounce.c:announce_myself_to_domain_master_browser(382)
  announce_myself_to_domain_master_browser: t (1116140666) - last(1116139866) < 900
[2005/05/15 09:04:36, 4] nmbd/nmbd_workgroupdb.c:dump_workgroups(271)
  dump_workgroups()
   dump workgroup on subnet 10.0.0.56: netmask= 255.255.254.0:
        SYSTINET(1) current master browser = AUTHCZ
                BUILDSRV 40009b03 (Buildsrv)
                AUTHCZ 400c9b2b (AuthCZ PDC)
[2005/05/15 09:04:36, 4] nmbd/nmbd_workgroupdb.c:dump_workgroups(271)
  dump_workgroups()
   dump workgroup on subnet UNICAST_SUBNET: netmask= 0.0.0.0:
        SYSTINET(1) current master browser = UNKNOWN
                BUILDSRV 40009b03 (Buildsrv)

Does anybody know what's going on there? I have no clue.

Thank you in advance,
z.p.

-- 
Zdenek Pizl
Systinet Corporation
Vinohradska 190
13000 Praha 3
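
With log level = 10 the nmbd log grows quickly, so it can help to tally entries by source function and debug level before reading them by eye. The following is only a rough sketch, assuming each entry starts with a header of the form "[date time, level] file:function(line)" as in the excerpts above. The names HEADER_RE, iter_headers and count_by_function are made up for this illustration and are not part of Samba.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed header layout, taken from the excerpts in this post, e.g.
# [2005/05/15 02:50:27, 4] nmbd/nmbd_workgroupdb.c:find_workgroup_on_subnet(162)
HEADER_RE = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(?P<level>\d+)\]\s+"
    r"(?P<file>[\w./-]+):(?P<func>\w+)\((?P<line>\d+)\)"
)


def iter_headers(text):
    """Yield one dict per debug-entry header found anywhere in text."""
    for m in HEADER_RE.finditer(text):
        yield {
            "time": datetime.strptime(m.group("ts"), "%Y/%m/%d %H:%M:%S"),
            "level": int(m.group("level")),
            "file": m.group("file"),
            "func": m.group("func"),
            "line": int(m.group("line")),
        }


def count_by_function(text, max_level=10):
    """Count entries per source function, ignoring entries above max_level."""
    return Counter(
        h["func"] for h in iter_headers(text) if h["level"] <= max_level
    )


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python tally.py /var/log/samba/nmbd.log
    with open(sys.argv[1], errors="replace") as fh:
        for func, n in count_by_function(fh.read(), max_level=5).most_common(20):
            print(n, func)
```

Lowering max_level hides the level 9 and 10 chatter (the memcmp and read_udp_socket lines) and leaves the less frequent messages easier to spot.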