bugzilla-daemon at mindrot.org
2005-Sep-13 06:39 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

           Summary: Intermittent ssh core dumps
           Product: Portable OpenSSH
           Version: 4.2p1
          Platform: Sparc
        OS/Version: Solaris
            Status: NEW
          Severity: normal
          Priority: P2
         Component: ssh
        AssignedTo: bitbucket at mindrot.org
        ReportedBy: js at phil.uu.nl

I get intermittent core dumps after installing/deploying 4.2p1. 4.1p1 was (and is) still working fine. Here's a backtrace in gdb:

(gdb) bt
#0  0x45a94 in mkstemp64 ()
#1  0x801c4 in mkstemp64 ()
#2  0x80074 in mkstemp64 ()
#3  0x80f00 in mkstemp64 ()
#4  0x836f8 in mkstemp64 ()
#5  0x7ecec in mkstemp64 ()
#6  0x49070 in mkstemp64 ()
#7  0x48e34 in mkstemp64 ()
#8  0x36340 in _init ()
#9  0x3496c in _init ()
#10 0x31aa0 in _init ()
#11 0x31240 in _init ()
#12 0x1d508 in _init ()
#13 0x1b310 in _init ()
#14 0x13f3c in _init ()

OS is Solaris 7, running on Sparc. OpenSSH was configured as follows:

/phil/sw/src/openssh-4.2p1/configure \
    --prefix=/phil/sw/sunos/sparc/pkg/openssh-4.2p1 \
    --sysconfdir=/phil/etc/openssh \
    --without-rsh \
    --with-pid-dir=/phil/var/run \
    --with-ssl-dir=/phil/sw/sunos/sparc/pkg/openssl-0.9.8 \
    --with-cppflags="-I/phil/sw/sunos/sparc/pkg/zlib-1.2.3/include -I/phil/src/tcpwrappers-7.6" \
    --with-ldflags="-L/phil/sw/sunos/sparc/pkg/zlib-1.2.3/lib -L/phil/src/tcpwrappers-7.6" \
    --with-default-path=/usr/bin:/bin:/phil/sw/sunos/sparc/bin \
    --with-tcp-wrappers \
    --with-skey=/phil/sw/pkg/skey-1.1.5 \
    --with-privsep-user=accessy \
    --with-privsep-path=/phil/var/prison

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
bugzilla-daemon at mindrot.org
2005-Sep-13 09:09 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From djm at mindrot.org 2005-09-13 19:09 -------
We might be able to help you if you tell us what you are doing when you get those coredumps. Also rebuild with debugging enabled, as a debugless trace doesn't tell us much.
bugzilla-daemon at mindrot.org
2005-Sep-13 09:23 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From js at phil.uu.nl 2005-09-13 19:23 -------
(In reply to comment #1)
> We might be able to help you if you tell us what you are doing when you get
> those coredumps. Also rebuild with debugging enabled, as a debugless trace
> doesn't tell us much.

What I'm doing is:

$ ssh <host>

Then: core dump, about one in four tries.

I'll rebuild with debugging at earliest convenience.
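A crash rate like "about one in four tries" can be measured by repeating the command and counting the runs that die on a signal. A minimal sketch, assuming a POSIX system and Python 3; the helper name count_signal_deaths is hypothetical and not part of OpenSSH:

```python
import subprocess


def count_signal_deaths(cmd, tries):
    """Run cmd `tries` times and return how many runs were killed by a signal."""
    killed = 0
    for _ in range(tries):
        result = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, subprocess reports death by signal N as returncode -N
        # (a segfault would show up as -11 on most systems).
        if result.returncode < 0:
            killed += 1
    return killed
```

For example, count_signal_deaths(["ssh", "-o", "BatchMode=yes", "somehost", "true"], 20) would report how many of 20 attempts crashed. The host name is a placeholder, and BatchMode=yes is used here only so that a password prompt cannot stall the loop.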
bugzilla-daemon at mindrot.org
2005-Sep-19 11:48 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From djm at mindrot.org 2005-09-19 21:48 -------
Also, which compiler (and version) are you using?
bugzilla-daemon at mindrot.org
2005-Sep-23 17:35 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From js at phil.uu.nl 2005-09-24 03:35 -------
(In reply to comment #3)
> Also, which compiler (and version) are you using?

js at goedel:pts/0(9) gcc -v                                      ~ 19:34
Using built-in specs.
Target: sparc-sun-solaris2.7
Configured with: /phil/sw/src/gcc-4.0.1/configure --prefix=/phil/sw/sunos/sparc/pkg/gcc-4.0.1 --disable-libgcj --enable-languages=c,c++,objc --with-gnu-as --with-as=/phil/sw/sunos/sparc/bin/as --with-gnu-ld --with-ld=/phil/sw/sunos/sparc/bin/ld --enable-shared
Thread model: posix
gcc version 4.0.1
bugzilla-daemon at mindrot.org
2005-Sep-23 21:55 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From djm at mindrot.org 2005-09-24 07:55 -------
Could you try a different compiler? gcc 4.x appears to generate broken code on quite a few platforms, e.g. bug #1080.
bugzilla-daemon at mindrot.org
2005-Sep-24 00:08 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From dtucker at zip.com.au 2005-09-24 10:08 -------
And when you do, please make sure you recompile any of the prereqs that were compiled with the newer compiler (esp. openssl but zlib too). You might also want to run openssl's self-test ("make tests") after you build it.

People have also reported problems with openssl 0.9.8 but I'm not sure if those were compiler-related or not.
bugzilla-daemon at mindrot.org
2005-Sep-26 10:38 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From cptsalek at gmail.com 2005-09-26 20:38 -------
Hi there,

I have the same problem when connecting with OpenSSH_4.2p1, OpenSSL 0.9.8 05 Jul 2005.
The compiler is a gcc version 3.4.3 (csl-sol210-3_4-branch+sol_rpath). I followed this bug and rebuilt zlib and libopenssl.
Solaris Version is 5.10 Generic_118822-02 sun4u sparc SUNW,Sun-Fire-V240.

I executed "make tests" a couple of times, and had a number of Segfaults. The last run produced the following output:

run test exit-status.sh ...
test remote exit status: proto 1 status 0
test remote exit status: proto 1 status 1
test remote exit status: proto 1 status 4
test remote exit status: proto 1 status 5
test remote exit status: proto 1 status 44
test remote exit status: proto 2 status 0
Write failed: Broken pipe
exit code (with sleep) mismatch for protocol 2: 255 != 0
test remote exit status: proto 2 status 1
Segmentation Fault - core dumped
exit code mismatch for protocol 2: 139 != 1
Segmentation Fault - core dumped
exit code (with sleep) mismatch for protocol 2: 139 != 1
test remote exit status: proto 2 status 4
Write failed: Broken pipe
exit code mismatch for protocol 2: 255 != 4
Segmentation Fault - core dumped
exit code (with sleep) mismatch for protocol 2: 139 != 4
test remote exit status: proto 2 status 5
test remote exit status: proto 2 status 44
failed remote exit status
make[1]: *** [t-exec] Error 1
make[1]: Leaving directory `/opt/gad/sources/openssh-4.2p1/regress'
make: *** [tests] Error 2

Running "ssh -vvv" produced the following output:

OpenSSH_4.2p1, OpenSSL 0.9.8 05 Jul 2005
debug1: Reading configuration data /etc/ssh/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to gszulg01 [10.64.10.84] port 22.
debug1: Connection established.
debug1: permanently_set_uid: 0/0
debug1: identity file /.ssh/identity type -1
debug1: identity file /.ssh/id_rsa type -1
debug1: identity file /.ssh/id_dsa type -1
debug1: Remote protocol version 1.99, remote software version OpenSSH_4.1
debug1: match: OpenSSH_4.1 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_4.2
debug2: fd 4 setting O_NONBLOCK
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,rijndael-cbc at lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,rijndael-cbc at lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ripemd160 at openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ripemd160 at openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib at openssh.com,zlib
debug2: kex_parse_kexinit: none,zlib at openssh.com,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc,rijndael-cbc at lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc,rijndael-cbc at lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ripemd160 at openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ripemd160 at openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib
debug2: kex_parse_kexinit: none,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_init: found hmac-md5
debug1: kex: server->client aes128-cbc hmac-md5 none
debug2: mac_init: found hmac-md5
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
Segmentation Fault (core dumped)

Backtrace is as follows:

# adb core
core file = core -- program ``/usr/bin/ssh'' on platform SUNW,Sun-Fire-V240
SIGSEGV: Segmentation Fault
$C
ffbfee20 bn_sub_words+0x3c(16b850, 16b3e0, 16b400, 7, 1, 9da20)
ffbfee90 bn_mul_recursive+0x40c(1, 20, 0, 10, 0, ffffffff)
ffbfef10 bn_mul_recursive+0x2e4(1, 40, 0, 20, 0, ffffffff)
ffbfef90 bn_mul_recursive+0x2e4(1, 80, 0, 40, 0, ffffffff)
ffbff010 BN_mul+0x2c4(159634, 16b530, 15960c, 159820, 2, 1)
ffbff088 BN_mod_mul_montgomery+0x3c(0, 1595f8, 15960c, 159858, 159820, 80)
ffbff0f8 BN_mod_exp_mont_consttime+0x56c(1595f8, 16b320, 100, d, 159820, 159858)
ffbff180 BN_mod_exp_mont+0x70(156308, 1562a8, ffbff2e0, 156288, 159820, 159858)
ffbff278 generate_key+0x94(15b7f0, 20, 1562e8, 0, 43, 149360)
ffbff308 DH_generate_key+0xc(15b7f0, 1562e8, 20, 0, c3, 0)
ffbff378 dh_gen_key+0x7c(15b7f0, 100, 1f, 7e0, ff000, ff)
ffbff3e8 kexgex_client+0x174(1586d0, 400, 916c8, 4e2fc, 2000, 1000)
ffbff488 kex_input_kexinit+0x5fc(1, 6, 1586d0, 158098, 169c10, 1586e0)
ffbff500 dispatch_run+0x94(0, 158714, 1586d0, 156248, 52ddc, 14e400)
ffbff578 ssh_kex2+0x17c(163688, 140c00, ffbff764, 15625c, 1, 0)
ffbff5e8 ssh_login+0x334(5, ffbff850, 4, 4, 1538b0, 152000)
ffbff860 main+0xce8(152064, 161ca0, 151c00, 151800, 153f48, 153400)
ffbffb20 _start+0x5c(0, 0, 0, 0, 0, 0)

Regards,
Christian

PS: Sorry for asking, but I searched the documentation, the net and even looked at the configure script, but I didn't find a clue of how to enable debugging during compile time. Did I miss something, and if so, could you advise how to enable debugging?
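The exit codes in the regression output above can be decoded mechanically. A minimal sketch, assuming the usual POSIX shell convention that a status above 128 means 128 plus the number of the signal that killed the process; the helper name decode_exit_status is hypothetical and not part of the OpenSSH test suite:

```python
import signal


def decode_exit_status(status):
    """Return the signal name if a shell-style exit status means death by signal, else None."""
    if status <= 128:
        # An ordinary exit code chosen by the program itself.
        return None
    try:
        # Shell convention: status = 128 + signal number.
        return signal.Signals(status - 128).name
    except ValueError:
        # No such signal, so the program most likely exited with this
        # value on its own (ssh uses 255 when it hits an error).
        return None
```

Under that assumption, the "139 != 1" mismatches correspond to signal 11, the segmentation fault that is also reported in plain text on the preceding line, while the "255" mismatches do not map to a signal and fit the "Write failed: Broken pipe" errors reported by ssh itself.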
bugzilla-daemon at mindrot.org
2005-Sep-26 11:10 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From dtucker at zip.com.au 2005-09-26 21:10 -------
(In reply to comment #7)
> core file = core -- program ``/usr/bin/ssh'' on platform SUNW,Sun-Fire-V240
> SIGSEGV: Segmentation Fault
> $C
> ffbfee20 bn_sub_words+0x3c(16b850, 16b3e0, 16b400, 7, 1, 9da20)

Looks like a problem with OpenSSL (the trace certainly points there). Did OpenSSL's self-test ("make tests") pass? Does the same problem occur with openssl-0.9.7g?

[...]
> PS: Sorry for asking, but I searched the documentation, the net and even looked
> at the configure script, but I didn't find a clue of how to enable debugging
> during compile time. Did I miss something, and if so, could you advice of how
> to enable debugging?

Debug symbols? Depends on your compiler, but for gcc it's automatically enabled (the "-g" flag). If it's not, then pass the appropriate flag via --with-cflags, eg:

./configure --with-cflags=-g

Note that by default, those symbols are stripped out in the installed binaries (ie you should use the compiled files in your build dir for debugging with gdb, adb or similar).
bugzilla-daemon at mindrot.org
2005-Sep-26 19:28 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From js at phil.uu.nl 2005-09-27 05:28 -------
(1) Looks like an OpenSSL 0.9.8 issue to me. Does not happen with 0.9.7g. 0.9.8's "make test" was unproblematic, though.

(2) Built OpenSSH with the "-g" flag. The core dump shows

js at goedel:pts/5(22) adb /phil/sw/sunos/sparc/obj/openssh-4.2p1/ssh core    ~ 21:17
core file = core -- program ``ssh'' on platform SUNW,Ultra-250
SIGSEGV: Segmentation Fault
$C
bn_sub_words() + 3c [savfp=0xffbeef48,savpc=0x801bc]
bn_sub_part_words(13f400,13ba88,13baa8,7,1,ac0b2f1) + 10 [savfp=0xffbeef48,savpc=0x801bc]
bn_mul_recursive(20,ffffffff,0,10,0,ffffffff) + 41c [savfp=0xffbeefd8,savpc=0x8006c]
bn_mul_recursive(40,ffffffff,13f360,20,0,ffffffff) + 2cc [savfp=0xffbef068,savpc=0x80ef8]
BN_mul(13e700,13e6b0,13e6c4,13e628,3,1) + 2b8 [savfp=0xffbef0e0,savpc=0x836f0]
BN_mod_mul_montgomery(13e6b0,13e6b0,13e6c4,13e480,13e628,1f) + 30 [savfp=0xffbef150,savpc=0x7ece4]
BN_mod_exp_mont_consttime(80,bb,ffbef244,b,13e628,13e480) + 424 [savfp=0xffbef1d8,savpc=0x49068]
generate_key(13e5d0,20,12a490,0,130,218ac) + 1c8 [savfp=0xffbef268,savpc=0x48e2c]
DH_generate_key(13e5d0,12a490,12a760,0,0,0) + c [savfp=0xffbef2d8,savpc=0x36338]
dh_gen_key(13e5d0,80,400,2000,ff1b5eec,ff146618) + 80 [savfp=0xffbef348,savpc=0x34964]
kexgex_client(13b8c0,2,0,0,ff1b5eec,400) + 168 [savfp=0xffbef3f0,savpc=0x31a98]
kex_input_kexinit(1,6,13b8c0,13b938,13ea20,2) + 45c [savfp=0xffbef468,savpc=0x31238]
dispatch_run(0,13b904,13b8c0,0,12a424,ff0000) + 54 [savfp=0xffbef4e0,savpc=0x1d500]
ssh_kex2(114c00,125aa0,0,7efefeff,81010100,ff0000) + 124 [savfp=0xffbef550,savpc=0x1b308]
ssh_login(126f34,ffbef7b8,4,4,127718,125800) + 30c [savfp=0xffbef7c8,savpc=0x13f34]
main(125800,125aa0,122000,126c00,127cd0,125800) + c20 [savfp=0xffbefa78,savpc=0x1245c]
bugzilla-daemon at mindrot.org
2005-Sep-27 01:00 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From dtucker at zip.com.au 2005-09-27 11:00 -------
(In reply to comment #9)
> (1) Looks like an OpenSSL 0.9.8 issue to me. Does not happen with 0.9.7g.
> 0.9.8's "make test" was unproblematic, though.

What options did you use when you built openssl-0.9.8? I'm trying to reproduce the problem.
bugzilla-daemon at mindrot.org
2005-Sep-29 13:03 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

t8m at centrum.cz changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |t8m at centrum.cz
bugzilla-daemon at mindrot.org
2005-Sep-29 13:12 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From cptsalek at gmail.com 2005-09-29 23:12 -------
Today I built OpenSSH 4.2p1 with OpenSSL 0.9.7g. Both OpenSSL and OpenSSH passed all tests; there wasn't a single segfault.

After this I removed all build directories and compiled OpenSSH with OpenSSL 0.9.8 again, including debugging flags. I don't know why, and I'm not really happy about it, but this time OpenSSH passed all tests and seems to work flawlessly.

So, from my point of view, this bug might be closed without a solution. I guess that the build environment probably wasn't sane. But this problem occurred in several builds created by at least two people on two different machines (both running Solaris 10, though).
bugzilla-daemon at mindrot.org
2005-Sep-29 13:34 UTC
[Bug 1085] Intermittent ssh core dumps
http://bugzilla.mindrot.org/show_bug.cgi?id=1085

------- Additional Comments From dtucker at zip.com.au 2005-09-29 23:34 -------
Argh, a Heisenbug!

I hate unsolved mysteries too, but I have no idea what else to suggest. There's a similar trace in bug #910 with a segfault at the same place (HP-UX, HP ANSI C compiler). I think it was openssl-0.9.8.

I'm going to leave this bug open for a while and see if we can collect any more info.