I'm seeing some extremely odd behavior with OpenSSH on Solaris. I have a suspicion it's me, but here's the story, and maybe someone can suggest an avenue of investigation. This seems to happen with every release of OpenSSH since at least 2.5.2p1.

1) Problem #1: If SSH protocol 1 is enabled, then sshd segfaults right off. This turns out to be because the call to arc4random_stir is corrupting memory and making sensitive_data.server_key non-NULL. When key_free is then called on its UNALLOCATED storage, you get a pretty segfault.

2) Problem #2: snprintf doesn't like the %.100s specifier. For some reason "00s" gets printed, and all the arguments get shifted. This breaks all sorts of things in all sorts of horrible ways. Some basic experimentation seems to indicate that if I take the .100 bit out and just leave %s behind, things will work. This is obviously the wrong fix. Note that this happens regardless of whether BROKEN_SNPRINTF is defined or not. (It isn't by default, but adding it to the top of bsd-snprintf.c and recompiling doesn't seem to help any.)

Any ideas?

--jeh

(Note I'm not subscribed, so please cc: me. Thanks!)
On Fri, 22 Feb 2002, Justin Hahn wrote:

> I'm seeing some extremely odd behavior with OpenSSH on Solaris. I have a
> suspicion it's me, but here's the story, and maybe someone can suggest an
> avenue of investigation. This seems to happen with every release of
> OpenSSH since at least 2.5.2p1.
>
> 1) Problem #1: If SSH protocol 1 is enabled, then sshd segfaults right
> off. This turns out to be because the call to arc4random_stir is
> corrupting memory and making sensitive_data.server_key non-NULL. When
> key_free is then called on its UNALLOCATED storage, you get a pretty
> segfault.
>

I can't replicate this, nor can I see it in the code, on either Solaris 2.5.1 or 7.

> 2) Problem #2: snprintf doesn't like the %.100s specifier. For some
> reason "00s" gets printed, and all the arguments get shifted. This breaks
> all sorts of things in all sorts of horrible ways. Some basic
> experimentation seems to indicate that if I take the .100 bit out and
> just leave %s behind, things will work. This is obviously the wrong
> fix. Note that this happens regardless of whether BROKEN_SNPRINTF is
> defined or not. (It isn't by default, but adding it to the top of
> bsd-snprintf.c and recompiling doesn't seem to help any.)
>

I can't replicate this either. bsd-snprintf.c was replaced with a version of my choice to help get the NeXTStep port to stabilize.

What compiler are you using? I've done compiles using gcc (forgot which version), and an OLD ProC intel.

- Ben
> I can't replicate this, nor can I see it in the code, on either
> Solaris 2.5.1 or 7.

After several more hours of poking around, it appears it had something to do with openssl. I don't quite know what was causing it, but a rebuild cleared it up. I can no longer reliably reproduce this, but something was VERY VERY wrong with openssl. (I went back to the code tree that produced it, and it failed make test with a segfault.)

> What compiler are you using? I've done compiles using gcc
> (forgot which version), and an OLD ProC intel.

gcc 2.95.3, gcc 3.0.3 and SunPro 6 update 2 (all Solaris...)

Anyhow, sorry for the false alarm. It didn't occur to me for some time that it could be openssl.

Thanks.

--jeh
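
For anyone chasing a similar symptom, the snprintf half of the report (the %.100s specifier printing "00s" and shifting the arguments) can be checked in isolation, away from sshd and openssl. What follows is only a minimal sketch, assuming a C99-style snprintf; precision_check is a hypothetical name made up for this illustration and is not part of OpenSSH or bsd-snprintf.c.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not OpenSSH code): returns 0 if snprintf treats
 * "%.100s" as a 100-character precision, or a nonzero code naming the
 * first check that failed.
 */
int precision_check(void)
{
    char src[201];
    char out[512];
    int n;

    /* 200 'a' characters, i.e. longer than the 100-character precision. */
    memset(src, 'a', sizeof(src) - 1);
    src[sizeof(src) - 1] = '\0';

    /* The precision should cap the string at 100 characters, and the
     * %d that follows should still receive its own argument (42). */
    n = snprintf(out, sizeof(out), "[%.100s] %d", src, 42);

    if (n != 105)                      /* "[" + 100 + "]" + " " + "42" */
        return 1;
    if (strspn(out + 1, "a") != 100)   /* exactly 100 'a's were copied */
        return 2;
    if (strcmp(out + 101, "] 42") != 0) /* arguments were not shifted */
        return 3;
    if (strstr(out, "00s") != NULL)    /* the symptom from the report */
        return 4;
    return 0;
}
```

Calling precision_check() from a small main and printing its return value, once linked against the system libc and once against the bundled replacement, would show whether the formatting code itself is at fault. A result of 0 from both suggests looking elsewhere, for example at memory corruption from a bad library build, which is where this thread ended up.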