David Collier-Brown - Sun Canada
2000-Mar-24  13:54 UTC
Samba-2.07pre2 still crashes under Linux and HP-UX
Jason Haar <Jason.Haar@trimble.co.nz> wrote:
| Well I've just had a crash last night, and I've copied the log files off.
| But I see nothing to suggest what happened. Also I've enabled core-dumps,
| and no core dump exists, even though smbd died...
|
| If you look at the smb.conf file, you'll see I've put in a HUGE sleep
| statement to be called via "panic action". Well I can tell you there was no
| left-over smbd process!!

Cool! That implies it tore down the connection "normally"...

[For the readers of samba@samba.org: clients are failing
mysteriously on an otherwise stable server. This is repeatable
on both Linux and HP-UX.]

The log starts at [2000/03/23 13:54:11], and
at 15:02:38 we get a Gethostbyaddr failure. The sequence was:
---
[2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
getpeername failed. Error was Bad file descriptor
[2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
getpeername failed. Error was Bad file descriptor
[2000/03/23 15:02:38, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.51.194
---
then, a little later,
---
[2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.51.52
[2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.34.14
---
The getpeername failures continue: the sequence repeats many times,
apparently as Samba tries to look up new clients when they connect.
I see 110 failures, apparently two per client looked up.
Do you have about 55 clients?
---
Around 19:13:00, 5 hours later, we see the first actual login,
starting with an SMBnegprot (negotiate protocol) packet.
We see a bunch of connections, each getting an smbd process:
---
switch message SMBnegprot (pid 21613)
switch message SMBnegprot (pid 21837)
switch message SMBnegprot (pid 22131)
switch message SMBnegprot (pid 22408)
switch message SMBnegprot (pid 22567)
switch message SMBnegprot (pid 23350)
switch message SMBnegprot (pid 23648)
switch message SMBnegprot (pid 24090)
---
Around 8 PM, a "TRANSFER" account logs on, looks at stuff and logs off.
---
[2000/03/23 20:39:01, 3] smbd/password.c:register_vuid(277)
User name: transfer Real name: Transfer Mechanism Account
[2000/03/23 20:39:01, 3] smbd/ipc.c:api_RNetShareEnum(1608)
RNetShareEnum gave 8 entries of 8 (1 65504 245 65504)
[2000/03/23 20:39:01, 4] rpc_client/cli_netlogon.c:cli_net_sam_logon(361)
cli_net_sam_logon: srv:\\NZ01ADC01 mc:CROM clnt 78244259E48CFB34 38d9d825
[2000/03/23 20:39:01, 3] smbd/server.c:exit_server(435)
Server exit (normal exit)
transfer@crom-crom.trimble.co.nz-Samba.log.old100644 0 0 154037! smb_vwv[7]=0 (0x0)
---
This implies that, as seen from Samba, the process exits normally:
something else is going on, and besides that, we have
a problem with getpeername.

Anyone here recognize the getpeername problem?

--dave
[Jason: can you give us a detailed description of what
the person at the PC client sees when the world goes bad?
We're going to look at clients and packet logs next...]
-- 
David Collier-Brown in Boston
Phone: (781) 442-0734, Room BUR03-3632
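
The tally described above (counting getpeername failures and the clients whose reverse lookup failed) can be reproduced with a short script. This is only a minimal sketch in Python, assuming each log entry is a bracketed header line followed by a message line, as in the excerpts quoted in this thread. The names `SAMPLE_LOG` and `summarize` are illustrative and not part of Samba, and the sample contains only the lines quoted above, not the full log.

```python
import re
from collections import Counter

# Only the entries quoted in this thread; the real log is much longer.
SAMPLE_LOG = """\
[2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
getpeername failed. Error was Bad file descriptor
[2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
getpeername failed. Error was Bad file descriptor
[2000/03/23 15:02:38, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.51.194
[2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.51.52
[2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
Gethostbyaddr failed for 155.63.34.14
"""

GETHOSTBYADDR_RE = re.compile(r"Gethostbyaddr failed for (\d+\.\d+\.\d+\.\d+)")


def summarize(log_text):
    """Return (number of getpeername failures, Counter of failed lookup addresses)."""
    getpeername_failures = 0
    failed_lookups = Counter()
    for line in log_text.splitlines():
        if "getpeername failed" in line:
            getpeername_failures += 1
        match = GETHOSTBYADDR_RE.search(line)
        if match:
            failed_lookups[match.group(1)] += 1
    return getpeername_failures, failed_lookups


failures, lookups = summarize(SAMPLE_LOG)
print("getpeername failures:", failures)
print("distinct addresses with failed reverse lookup:", len(lookups))
```

Run against the full log file instead of the sample, the two numbers would show whether the "two failures per client" pattern actually holds.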
David Collier-Brown - Sun Canada wrote:
>
> Jason Haar <Jason.Haar@trimble.co.nz> wrote:
> | Well I've just had a crash last night, and I've copied the log files off.
> | But I see nothing to suggest what happened. Also I've enabled core-dumps,
> | and no core dump exists, even though smbd died...
> |
> | If you look at the smb.conf file, you'll see I've put in a HUGE sleep
> | statement to be called via "panic action". Well I can tell you there was no
> | left-over smbd process!!
>
> Cool! That implies it tore down the connection "normally"...
>
> [For the readers of samba@samba.org: clients are failing
> mysteriously on an otherwise stable server. This is repeatable
> on Linux and HP/UX both]
>
> The log starts at [2000/03/23 13:54:11], and
> at 15:02:38 we get a Gethostbyaddr failure. The sequence was:
> ---
> [2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
> getpeername failed. Error was Bad file descriptor
> [2000/03/23 15:02:38, 0] lib/util_sock.c:client_name(999)
> getpeername failed. Error was Bad file descriptor
> [2000/03/23 15:02:38, 1] lib/util_sock.c:client_name(1007)
> Gethostbyaddr failed for 155.63.51.194
> ---
> then, a little later,
> ---
> [2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
> Gethostbyaddr failed for 155.63.51.52
> [2000/03/23 15:03:44, 1] lib/util_sock.c:client_name(1007)
> Gethostbyaddr failed for 155.63.34.14

Looks like it might be a glibc issue. Are you running out of
file descriptors? What do you have the fd's per process set to?

Jeremy.
-- 
--------------------------------------------------------
Buying an operating system without source is like buying
a self-assembly Space Shuttle with no instructions.
--------------------------------------------------------
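
For anyone wanting to answer the descriptor question on their own box, here is a minimal sketch in Python (not Samba code) that reads the per-process descriptor limit with `resource.getrlimit` and shows which errno `getpeername` reports on a socket that has already been closed. The helper names `fd_limits` and `getpeername_errno_on_closed_socket` are made up for illustration. It assumes a Unix-like system and reports the limits of the Python process itself, so it would need to be run under the same account and startup environment as smbd to say anything about smbd.

```python
import errno
import resource
import socket


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def getpeername_errno_on_closed_socket():
    """Call getpeername() on an already-closed socket and return the errno raised."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.close()
    try:
        sock.getpeername()
    except OSError as exc:
        return exc.errno
    return None


soft, hard = fd_limits()
print("open file descriptors: soft limit %s, hard limit %s" % (soft, hard))

code = getpeername_errno_on_closed_socket()
print("getpeername on a closed socket:", errno.errorcode.get(code, code))
```

On Linux the second call reports EBADF, the same "Bad file descriptor" text seen in the log. Running out of descriptors normally shows up as EMFILE on `accept` or `open` instead, so the limit is worth checking, but EBADF by itself more often means the descriptor passed in was not open at that moment.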
Bartlomiej Solarz-Niesluchowski
2000-Mar-27  07:42 UTC
Samba-2.07pre2 still crashes under Linux and HP-UX
I can confirm the Gethostbyaddr problem on RH Linux, HP-UX 10.20
and SunOS 4.1.4 (Samba 2.0.6).

Best Regards
****************************************************************
* Bartlomiej Solarz-Niesluchowski                              *
* Administrator WSISiZ                                         *
* Motto - don't break Win'9x, it breaks without your help.... *
* As you make your bed, so you will sleep in it                *
****************************************************************