Hi,

I keep running into problems with Samba :s ...

Now that I have worked around the bug that scans all LDAP users on every connection, smbd crashes very often.

In the log files of some workstations I can see something like this:

...
[2009/10/01 16:28:12, 2] smbd/open.c:580(open_file)
  baala opened file .profiles/firefox/cookies.sqlite-journal read=No write=No (numopen=20)
[2009/10/01 16:28:12, 2] smbd/close.c:612(close_normal_file)
  baala closed file .profiles/firefox/cookies.sqlite-journal (numopen=19) NT_STATUS_OK
*** glibc detected *** /usr/sbin/smbd: realloc(): invalid next size: 0x0955c5c8 ***
======= Backtrace: ========
/lib/tls/i686/cmov/libc.so.6[0xb7cca604]
/lib/tls/i686/cmov/libc.so.6[0xb7cce1b1]
/lib/tls/i686/cmov/libc.so.6(realloc+0x106)[0xb7cceee6]
/usr/sbin/smbd(Realloc+0x7d)[0x834326d]
/usr/sbin/smbd(brl_lock+0x4a3)[0x82d1f23]
/usr/sbin/smbd(do_lock+0x147)[0x82cc517]
/usr/sbin/smbd[0x8120467]
/usr/sbin/smbd[0x8121e7a]
/usr/sbin/smbd(reply_trans2+0x6ef)[0x8123b5f]
/usr/sbin/smbd[0x8145848]
/usr/sbin/smbd[0x81481ad]
/usr/sbin/smbd[0x8148bd2]
/usr/sbin/smbd(run_events+0x13c)[0x8353cac]
/usr/sbin/smbd(smbd_process+0x791)[0x8147cd1]
/usr/sbin/smbd[0x8623a25]
/usr/sbin/smbd(run_events+0x13c)[0x8353cac]
/usr/sbin/smbd[0x8353f4e]
/usr/sbin/smbd(_tevent_loop_once+0x9b)[0x835458b]
/usr/sbin/smbd(main+0xc12)[0x8624732]
/lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7c71775]
/usr/sbin/smbd[0x80c3e91]
======= Memory map: =======
...

The problem is that when this happens, that user's smbd process crashes and leaves the smbd service in an unusable state. Users who are already connected can keep working, but no new connections are accepted, and I see a lot of "smbd <defunct>" processes appearing.

Right now Samba 3.4.1 (and 3.4.2) is completely unusable because this happens dozens of times per day. I have tested all the server hardware and found no problem. The server has 2 quad-core E5430 CPUs @ 2.66 GHz and 4 GB of memory. There are 64 workstations in the domain (all dual-boot WinXP Pro / Ubuntu 9.04) and about 600 users.

Any ideas?

Regards,
Bruno

-- 
Bruno MACADRE
-------------------------------------------------------------------
Ingénieur Systèmes et Réseau | Systems and Network Engineer
Département Informatique | Department of computer science
Responsable Réseau et Téléphonie | Telecom and Network Manager
Université de Rouen | University of Rouen
-------------------------------------------------------------------
Coordonnées / Contact :
Université de Rouen
Faculté des Sciences et Techniques - Madrillet
Avenue de l'Université - BP12
76801 St Etienne du Rouvray CEDEX
Tél : +33 (0)2-32-95-51-86
Fax : +33 (0)2-32-95-51-87
-------------------------------------------------------------------
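For anyone reading the backtrace above, here is a minimal Python sketch that pulls out only the frames glibc managed to symbolize, which makes the call path into realloc() easier to see. The frame lines are copied from the post; the regular expression and the helper name symbolized_frames are assumptions made for illustration, not part of Samba or glibc.

```python
import re

# Frames copied from the glibc backtrace in the message above
# (innermost frame first; unsymbolized frames are kept to show they are skipped).
BACKTRACE = """\
/lib/tls/i686/cmov/libc.so.6(realloc+0x106)[0xb7cceee6]
/usr/sbin/smbd(Realloc+0x7d)[0x834326d]
/usr/sbin/smbd(brl_lock+0x4a3)[0x82d1f23]
/usr/sbin/smbd(do_lock+0x147)[0x82cc517]
/usr/sbin/smbd[0x8120467]
/usr/sbin/smbd[0x8121e7a]
/usr/sbin/smbd(reply_trans2+0x6ef)[0x8123b5f]
"""

# object(symbol+0xoffset)[0xaddress], where the "(symbol+0xoffset)" part is optional.
FRAME_RE = re.compile(
    r"^(?P<obj>\S+?)(?:\((?P<sym>[^+()]+)\+0x[0-9a-f]+\))?\[(?P<addr>0x[0-9a-f]+)\]$"
)


def symbolized_frames(text):
    """Return the symbol names of all frames that carry one, in backtrace order."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group("sym"):
            frames.append(match.group("sym"))
    return frames


call_chain = symbolized_frames(BACKTRACE)
print(" <- ".join(call_chain))
```

Read from right to left, the printed chain shows a trans2 request reaching the byte-range locking code (do_lock, brl_lock), which then fails while growing a buffer through Realloc.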
On Fri, Oct 02, 2009 at 10:43:07AM +0200, Bruno MACADRE wrote:
> I keep running into problems with Samba :s ...
>
> Now that I have worked around the bug that scans all LDAP users on every
> connection, smbd crashes very often.
>
> In the log files of some workstations I can see something like this:

Can you run smbd with debug level 10 and send a few thousand lines before that crash?

Volker
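A level 10 log grows quickly, so cutting out just the lines before the crash can be scripted. The following is a minimal Python sketch, assuming the crash shows up in the log as the "*** glibc detected ***" line seen earlier in this thread; the function name lines_before_crash and the default of 3000 lines are illustrative choices, and the log file path is left to the reader.

```python
from collections import deque

# Marker text taken from the log excerpt quoted in this thread.
CRASH_MARKER = "*** glibc detected ***"


def lines_before_crash(lines, context=3000, marker=CRASH_MARKER):
    """Return up to `context` lines preceding the first line that contains
    `marker`, followed by that line. Returns [] if the marker never appears."""
    window = deque(maxlen=context)  # keeps only the most recent `context` lines
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []


# Small in-memory demonstration; with a real log, pass an open file object instead.
sample = [f"debug line {i}" for i in range(10)]
sample.append("*** glibc detected *** /usr/sbin/smbd: realloc(): invalid next size")
sample.append("line after the crash")

excerpt = lines_before_crash(sample, context=4)
for line in excerpt:
    print(line)
```

Because the function only iterates over its input once and keeps a bounded window, it can be fed a file object for a large per-workstation log without loading the whole file into memory.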
Bruno MACADRE wrote:
> Hi,
>
> I keep running into problems with Samba :s ...
>
> Now that I have worked around the bug that scans all LDAP users on every
> connection, smbd crashes very often.
>
> In the log files of some workstations I can see something like this:
>
> ...
> [2009/10/01 16:28:12, 2] smbd/open.c:580(open_file)
>   baala opened file .profiles/firefox/cookies.sqlite-journal read=No write=No (numopen=20)
> [2009/10/01 16:28:12, 2] smbd/close.c:612(close_normal_file)
>   baala closed file .profiles/firefox/cookies.sqlite-journal (numopen=19) NT_STATUS_OK
> *** glibc detected *** /usr/sbin/smbd: realloc(): invalid next size: 0x0955c5c8 ***
> ======= Backtrace: ========
> /lib/tls/i686/cmov/libc.so.6[0xb7cca604]
> /lib/tls/i686/cmov/libc.so.6[0xb7cce1b1]
> /lib/tls/i686/cmov/libc.so.6(realloc+0x106)[0xb7cceee6]
> /usr/sbin/smbd(Realloc+0x7d)[0x834326d]
> /usr/sbin/smbd(brl_lock+0x4a3)[0x82d1f23]
> /usr/sbin/smbd(do_lock+0x147)[0x82cc517]
> /usr/sbin/smbd[0x8120467]
> /usr/sbin/smbd[0x8121e7a]
> /usr/sbin/smbd(reply_trans2+0x6ef)[0x8123b5f]
> /usr/sbin/smbd[0x8145848]
> /usr/sbin/smbd[0x81481ad]
> /usr/sbin/smbd[0x8148bd2]
> /usr/sbin/smbd(run_events+0x13c)[0x8353cac]
> /usr/sbin/smbd(smbd_process+0x791)[0x8147cd1]
> /usr/sbin/smbd[0x8623a25]
> /usr/sbin/smbd(run_events+0x13c)[0x8353cac]
> /usr/sbin/smbd[0x8353f4e]
> /usr/sbin/smbd(_tevent_loop_once+0x9b)[0x835458b]
> /usr/sbin/smbd(main+0xc12)[0x8624732]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7c71775]
> /usr/sbin/smbd[0x80c3e91]
> ======= Memory map: =======
> ...
>
> The problem is that when this happens, that user's smbd process crashes
> and leaves the smbd service in an unusable state. Users who are already
> connected can keep working, but no new connections are accepted, and I
> see a lot of "smbd <defunct>" processes appearing.
>
> Right now Samba 3.4.1 (and 3.4.2) is completely unusable because this
> happens dozens of times per day. I have tested all the server hardware
> and found no problem. The server has 2 quad-core E5430 CPUs @ 2.66 GHz
> and 4 GB of memory. There are 64 workstations in the domain (all
> dual-boot WinXP Pro / Ubuntu 9.04) and about 600 users.
>
> Any ideas?
>
> Regards,
> Bruno

So, I recompiled my Samba 3.4.2 and ran it under valgrind to see what is happening:

- First, under valgrind I have not seen any crash with a memory dump (see my previous message) during the last 2 hours.

- When I analyse the valgrind log file, I see a lot of error messages like these:

==12818== Invalid read of size 1
==12818==    at 0x53C93F0: _nss_files_setnetgrent (in /lib/tls/i686/cmov/libnss_files-2.9.so)
==12818==    ...
==12818==    ...
==12818==    by 0x41975B: run_events (events.c:126)
==12818==  Address 0x4f0f887 is 1 bytes before a block of size 120 alloc'd
==12818==    at 0x4826FDE: malloc (vg_replace_malloc.c:207)
==12818==    by 0x4B92737: getdelim (in /lib/tls/i686/cmov/libc-2.9.so)
==12818==    ...
==12818==    ...
==12818==    by 0x20FE37: switch_message (process.c:1377)
==12818==    by 0x2127EC: process_smb (process.c:1408)

I don't know whether this kind of allocation error is normal (in case anyone from the Samba developer team reads this ^^), but I think it is a clue to the smbd crash (the "realloc" errors in my previous message).

See you later with further information... I'm going back to checking Samba for memory leaks ^^

Regards,
Bruno
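When a valgrind log contains "a lot of" such messages, a quick tally by faulting function shows whether they all come from one place. This is a minimal Python sketch under the assumption that each error starts with an "Invalid read/write of size N" line followed by an "at 0x...: function" line, as in the excerpt above; the function name tally_invalid_accesses is hypothetical, and the sample omits the frames the post elided with "...".

```python
import re
from collections import Counter

# "==PID== Invalid read of size 1" style header lines.
ERROR_RE = re.compile(r"^==\d+== (Invalid (?:read|write) of size \d+)")
# "==PID==    at 0xADDR: function (...)" lines, i.e. the innermost frame.
AT_RE = re.compile(r"^==\d+==\s+at 0x[0-9A-Fa-f]+: (\S+)")


def tally_invalid_accesses(lines):
    """Count (error kind, innermost function) pairs in valgrind output lines."""
    counts = Counter()
    pending = None  # error header waiting for its first "at" frame
    for line in lines:
        match = ERROR_RE.match(line)
        if match:
            pending = match.group(1)
            continue
        match = AT_RE.match(line)
        if match and pending is not None:
            counts[(pending, match.group(1))] += 1
            pending = None  # ignore later "at" lines such as the allocation site
    return counts


# Lines copied from the valgrind excerpt in the message above.
SAMPLE = """\
==12818== Invalid read of size 1
==12818==    at 0x53C93F0: _nss_files_setnetgrent (in /lib/tls/i686/cmov/libnss_files-2.9.so)
==12818==    by 0x41975B: run_events (events.c:126)
==12818==  Address 0x4f0f887 is 1 bytes before a block of size 120 alloc'd
==12818==    at 0x4826FDE: malloc (vg_replace_malloc.c:207)
==12818==    by 0x4B92737: getdelim (in /lib/tls/i686/cmov/libc-2.9.so)
""".splitlines()

counts = tally_invalid_accesses(SAMPLE)
for (kind, function), n in counts.most_common():
    print(f"{n:5d}  {kind}  in {function}")
```

In the excerpt the faulting frame is in libnss_files rather than in smbd itself, so a tally like this may help separate errors in system libraries from errors in Samba code.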
Volker Lendecke wrote:
> On Fri, Oct 02, 2009 at 04:33:39PM +0200, Bruno MACADRE wrote:
>> OK, I will set the log level to 10 at the next restart of the server; when
>> the crash happens I will send you all the lines you asked for :)
>>
>> Another clue: the crash only appears for users who connect from Linux (that
>> is, users who mount their homes and shares via CIFS). Users on WinXP
>> Pro workstations have no problem (as long as the server doesn't crash, of course ^^).
>
> Oh, that is indeed a very good hint. What cifs version are
> you running, and what apps?
>
> Volker

I have no more information about the crash mentioned in my previous message, but the latest mail from "Ralph Kutschera" (Unknown panic action) made me think about why my server seems to hang when a panic occurs.

The answer is pretty simple. I have no "panic action" in my smb.conf, and the man page says "Default: panic action =". But if I type the following command:

# testparm -v | grep panic

I get:

panic action = /bin/sleep 999999999

I don't know why the new default for panic action in Samba 3.4.2 is /bin/sleep 999999999, but a simple calculation shows that my server sleeps for 31 years when a crash occurs!!! 31 years is a long time, so I'm going to set a REAL panic action in my smb.conf!!

This will resolve neither my ldapsam problem (bug #6771), nor my smbd crash when mounting CIFS shares under Linux, nor my problem of broken compilation over a CIFS share, but I think my server will be more efficient without sleeping for 31 years!!

Regards,
Bruno.
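The "31 years" figure can be checked with a short calculation. This Python sketch converts the sleep argument reported by testparm into years; the choice of a 365.25-day year is an assumption made for the conversion.

```python
# Argument passed to /bin/sleep, as reported by testparm in the message above.
SLEEP_SECONDS = 999_999_999

# Assumption: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = SLEEP_SECONDS / SECONDS_PER_YEAR
print(f"{years:.1f} years")  # prints "31.7 years"
```

So the figure in the message is right, rounded down: the sleep lasts a little under 32 years.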
On Fri, Oct 02, 2009 at 10:43:07AM +0200, Bruno MACADRE wrote:
> Hi,
>
> I keep running into problems with Samba :s ...
>
> Now that I have worked around the bug that scans all LDAP users on every
> connection, smbd crashes very often.
>
> In the log files of some workstations I can see something like this:
>
> ...
> [2009/10/01 16:28:12, 2] smbd/open.c:580(open_file)
>   baala opened file .profiles/firefox/cookies.sqlite-journal read=No write=No (numopen=20)
> [2009/10/01 16:28:12, 2] smbd/close.c:612(close_normal_file)
>   baala closed file .profiles/firefox/cookies.sqlite-journal (numopen=19) NT_STATUS_OK
> *** glibc detected *** /usr/sbin/smbd: realloc(): invalid next size: 0x0955c5c8 ***
> ======= Backtrace: ========
> /lib/tls/i686/cmov/libc.so.6[0xb7cca604]
> /lib/tls/i686/cmov/libc.so.6[0xb7cce1b1]
> /lib/tls/i686/cmov/libc.so.6(realloc+0x106)[0xb7cceee6]
> /usr/sbin/smbd(Realloc+0x7d)[0x834326d]
> /usr/sbin/smbd(brl_lock+0x4a3)[0x82d1f23]
> /usr/sbin/smbd(do_lock+0x147)[0x82cc517]
> /usr/sbin/smbd[0x8120467]
> /usr/sbin/smbd[0x8121e7a]
> /usr/sbin/smbd(reply_trans2+0x6ef)[0x8123b5f]
> /usr/sbin/smbd[0x8145848]
> /usr/sbin/smbd[0x81481ad]
> /usr/sbin/smbd[0x8148bd2]
> /usr/sbin/smbd(run_events+0x13c)[0x8353cac]
> /usr/sbin/smbd(smbd_process+0x791)[0x8147cd1]
> /usr/sbin/smbd[0x8623a25]
> /usr/sbin/smbd(run_events+0x13c)[0x8353cac]
> /usr/sbin/smbd[0x8353f4e]
> /usr/sbin/smbd(_tevent_loop_once+0x9b)[0x835458b]
> /usr/sbin/smbd(main+0xc12)[0x8624732]
> /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe5)[0xb7c71775]
> /usr/sbin/smbd[0x80c3e91]
> ======= Memory map: =======

This is almost certainly bug 6776, which I just committed a fix for. I'm planning a back-port; what specific Samba version do you need the fix for?

Jeremy.