Cal Leeming [Simplicity Media Ltd]
2011-May-30 20:56 UTC
Fwd: Re: Fwd: cgroup OOM killer loop causes system to lockup (possible fix included) - now pinpointed to openssh-server
Just did some testing..

root@vicky:~# cat /var/log/auth.log | grep "Set"
May 30 21:41:05 vicky sshd[1568]: Set /proc/self/oom_adj from -17 to -17
May 30 21:41:07 vicky sshd[1574]: Set /proc/self/oom_adj to -17

root@vicky:~# ps faux | grep 1574
root 1574 0.0 0.0 70488 3404 ? Ss 21:41 0:00 \_ sshd: root@pts/1

root@vicky:~# ps faux | grep "1568"
root 1568 0.0 0.0 49168 1152 ? Ss 21:41 0:00 /usr/sbin/sshd

In openbsd-compat/port-linux.c there seems to be:
static int oom_adj_save = INT_MIN;

root@courtney:~/openssh-5.5p1# grep -R "Set %s to %d" .
./openbsd-compat/port-linux.c: verbose("Set %s to %d", OOM_ADJ_PATH, oom_adj_save);

Then I tried on a server with different network card hardware (as shown below), and got this from the logs:

root@courtney:~/openssh-5.5p1# cat /var/log/auth.log | grep "Set"
May 30 21:50:15 courtney sshd[4821]: Set /proc/self/oom_adj from 0 to -17
May 30 21:50:26 courtney sshd[4848]: Set /proc/self/oom_adj to 0

root@courtney:~/openssh-5.5p1# ps faux | grep "4848"
root 4848 0.0 0.0 70488 3372 ? Ss 21:50 0:00 \_ sshd: root@pts/1

root@courtney:~/openssh-5.5p1# ps faux | grep "4821"
root 4821 0.0 0.0 49168 1160 ? Ss 21:50 0:00 /usr/sbin/sshd

root@courtney:~/openssh-5.5p1# cat /var/log/auth.log | grep -e "Set" -e "oom_adjust_restore"
May 30 21:50:15 courtney sshd[4821]: Set /proc/self/oom_adj from 0 to -17
May 30 21:50:26 courtney sshd[4848]: debug3: oom_adjust_restore
May 30 21:50:26 courtney sshd[4848]: Set /proc/self/oom_adj to 0

On 30/05/2011 21:30, Cal Leeming [Simplicity Media Ltd] wrote:
> Hi all,
>
> Please find below a complete transcript of the emails between
> debian/kernel-mm mailing lists.
>
> I've had a response back from someone on the deb mailing list stating:
>
> ===================================
> The bug seems to be that sshd does not reset the OOM adjustment before
> running the login shell (or other program). Therefore, please report a
> bug against openssh-server.
> ===================================
>
> Therefore, I am submitting this bug to you also.. If someone would be
> kind enough to have a flick thru all the below debug/logs, it'd be
> very much appreciated.
>
> Cal
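The two sets of log lines above are consistent with a save-and-restore scheme: the listening sshd records whatever oom_adj value it inherited, writes -17, and the per-session child later writes the recorded value back. The following is only a minimal sketch of that idea, not the OpenSSH source. The names model_setup, model_restore, model_read_adj, model_write_adj and model_saved are hypothetical, and the file path is a parameter so the sketch can be tried on an ordinary file instead of /proc/self/oom_adj.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define MODEL_NOKILL (-17)      /* value the parent writes to protect itself */

/* Hypothetical stand-in for the saved value; INT_MIN means "nothing saved". */
static int model_saved = INT_MIN;

/* Write `value` to the file at `path`, replacing its contents. */
int model_write_adj(const char *path, int value)
{
    FILE *fp;
    int ok;

    assert(path != NULL);
    if ((fp = fopen(path, "w")) == NULL)
        return -1;
    ok = fprintf(fp, "%d\n", value) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Read the integer stored at `path` into *out. */
int model_read_adj(const char *path, int *out)
{
    FILE *fp;
    int ok;

    assert(path != NULL && out != NULL);
    if ((fp = fopen(path, "r")) == NULL)
        return -1;
    ok = fscanf(fp, "%d", out) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}

/* Parent: remember the inherited value, then switch to "never kill". */
int model_setup(const char *path)
{
    int current;

    if (model_read_adj(path, &current) != 0)
        return -1;
    model_saved = current;
    return model_write_adj(path, MODEL_NOKILL);
}

/* Child: put back whatever the parent saved. */
int model_restore(const char *path)
{
    if (model_saved == INT_MIN)
        return -1;              /* nothing was saved, leave the file alone */
    return model_write_adj(path, model_saved);
}
```

Under this model, a file that starts at -17 still holds -17 after setup followed by restore, which matches the log on vicky, while a file that starts at 0 ends at 0, which matches courtney. If that is what is happening, the difference between the two hosts would come from the value sshd inherited when it was started rather than from the restore step itself.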
Cal Leeming [Simplicity Media Ltd]
2011-May-30 21:32 UTC
port-linux.c bug with oom_adjust_restore() - causes real bad oom_adj - which can cause DoS conditions.
So I modified the code to try and repair this oom_adj problem...

port-linux.c:
line 235: //static int oom_adj_save = INT_MIN;
line 236: static int oom_adj_save = 0;
line 277: verbose("Set %s to %d - sleepycal", OOM_ADJ_PATH, oom_adj_save);

I then compiled the package, ran sshd, and yet we still have -17 in oom_adj_save. Wtf? Now, I'm not much of a C coder, but this is weird even in my books...

May 30 22:18:19 vicky sshd[12825]: Set /proc/self/oom_adj to -17 - sleepycal

So, I went all out crazy, and did the following patch:

static int sleepycal_oom_adj_save = 0;

verbose("sleepycal_oom_adj_save=%d", sleepycal_oom_adj_save);
if (fprintf(fp, "%d\n", sleepycal_oom_adj_save) <= 0)
        verbose("error writing %s: %s", OOM_ADJ_PATH, strerror(errno));
else
        verbose("Set %s to %d - sleepycal", OOM_ADJ_PATH, sleepycal_oom_adj_save);

And it worked!!! :)

May 30 22:27:12 vicky sshd[2532]: sleepycal_oom_adj_save=0
May 30 22:27:12 vicky sshd[2532]: Set /proc/self/oom_adj to 0 - sleepycal

root@vicky:~/openssh-5.5p1# cat /proc/2532/oom_adj
0

So, it turns out that it is actually OpenSSH which is broken, after almost 3 days of frustrating digging through millions of lines of code lol.

Anyways, would appreciate if someone could get this merged into master (obv rename the vars if you want). Attached is the appropriate patch file as of openssh-5.5p1

Cal

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: oom_patch_for_openssh-5.5p1_by_sleepycal.patch
URL: <http://lists.mindrot.org/pipermail/openssh-unix-dev/attachments/20110530/3a257e54/attachment.ksh>
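One possible reading of why the first edit (changing the initializer to 0) had no visible effect while the second patch did: if the setup step overwrites oom_adj_save with the value it reads from /proc/self/oom_adj, as the earlier "from -17 to -17" log line suggests, then the compile-time initializer never reaches the restore step. The sketch below models only that reasoning with plain integers. It is an assumption about the control flow, not the OpenSSH code, and model_child_adj, restore_policy and ORIGINAL_INIT are hypothetical names.

```c
#include <assert.h>
#include <limits.h>

#define ORIGINAL_INIT INT_MIN   /* initializer before the first edit */

/* Which value the child writes back: the saved one, or a constant 0
 * as in the second patch. */
enum restore_policy { RESTORE_SAVED, RESTORE_ZERO };

/* Return the oom_adj value the session child ends up with.
 *   initializer: compile-time initial value of the saved variable
 *   inherited:   value the parent sshd found in oom_adj at startup
 */
int model_child_adj(int initializer, int inherited, enum restore_policy policy)
{
    int saved = initializer;

    /* oom_adj accepts -16..15, plus -17 meaning "never kill". */
    assert(inherited >= -17 && inherited <= 15);

    /* Setup step (assumed): the value read from the file replaces the
     * initializer, so the initializer has no further influence. */
    saved = inherited;

    /* Restore step: either the saved value or a hard-coded 0. */
    return policy == RESTORE_ZERO ? 0 : saved;
}
```

In this model an inherited -17 comes back as -17 whether the initializer is INT_MIN or 0, and only the constant-0 policy yields 0. The trade-off of the constant is that it also ignores any non-default value an administrator may have set deliberately, so finding out why sshd inherits -17 on vicky in the first place may be worth doing before merging.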