Eugene Grosbein
2018-Nov-26 12:34 UTC
high cpu irq load and slow boot after update from 10.4 to 11.2
26.11.2018 15:46, Gerrit Kühn wrote:

> A couple of weeks ago, I updated an older storage server (2 CPUs, 4 cores
> each, 48GB RAM, 36x4GB HDDs, 3 LSI-based mps controllers) from 10.4 to
> 11.2. The first thing I noticed was that booting takes much longer now. The
> system probes each HDD (there are 36 of them, attached to mps controllers)
> very slowly multiple times (I can see the light of each disk blinking,
> it takes seconds to go on to the next disk), the whole process takes
> several minutes (was much faster before).
>
> A more nasty issue appears after a couple of weeks of operation (so far,
> roughly between 15 and 30 days):
> Suddenly there is a very high irq load on one of the CPU cores
> (cpu<n>:timer), causing high system load and high cpu load (top easily
> shows average load over 10, whereas it was always below 1 before). I cannot
> find any process or device as a culprit. First I thought this problem can
> only be made to go away by rebooting, but now I managed to get rid of it
> (at least for some time, don't know if or when it will be back) while
> checking out the latest source in background (I actually intended to fiddle
> with some kernel settings, but suddenly the issue was gone after
> persisting permanently over the weekend), causing.
>
> Looking around, I found a couple of vaguely similar reports (like
> https://lists.freebsd.org/pipermail/freebsd-current/2017-January/064419.html),
> but these all appear to be fixed by now.
> I have a couple of other storage machines (mostly mps-based, but always
> slightly different hardware) that show no such issue after updating to
> 11.2.
>
> Any ideas?

Maybe this box has some clocking problems incompatible with the tickless kernel. Try going back to the old periodic ticking with sysctl kern.eventtimer.periodic=1 instead of the now-default 0. Or, if you are curious, run ntpd if it is not already running, wait about an hour, then look at its /var/db/ntpd.drift file to see if the system clock is good or not.
Perhaps you can get better behaviour by changing the default value of kern.timecounter.hardware to another one from kern.timecounter.choice; the same goes for kern.eventtimer.timer and kern.eventtimer.choice.
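
For anyone repeating the drift check suggested above, here is a minimal sketch of how the number in the drift file might be read. It assumes the file holds ntpd's frequency offset in parts per million (PPM) and uses ntpd's 500 PPM correction limit as the hard bound. The function names and the 100 PPM warning threshold are hypothetical, chosen only for illustration.

```python
# Sketch: interpret the frequency offset stored in ntpd's drift file.
# The value is in parts per million (PPM). Helper names are hypothetical.

SECONDS_PER_DAY = 86_400
NTPD_MAX_PPM = 500.0  # ntpd cannot correct a clock that is off by more than this


def drift_seconds_per_day(ppm: float) -> float:
    """Convert a frequency offset in PPM to seconds gained or lost per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def describe_drift(ppm: float, warn_ppm: float = 100.0) -> str:
    """Classify a drift value. warn_ppm is an arbitrary illustrative threshold."""
    if abs(ppm) >= NTPD_MAX_PPM:
        return "out of range"
    if abs(ppm) >= warn_ppm:
        return "suspicious"
    return "ok"


def read_drift(path: str = "/var/db/ntpd.drift") -> float:
    """Read the first number from the drift file (path as given in the thread)."""
    with open(path) as fh:
        return float(fh.read().split()[0])
```

Under these assumptions, an offset of 100 PPM works out to roughly 8.6 seconds per day of uncorrected drift, while a value of a few PPM is unremarkable.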
Gerrit Kühn
2018-Nov-26 13:14 UTC
high cpu irq load and slow boot after update from 10.4 to 11.2
On Mon, 26 Nov 2018 19:34:43 +0700
Eugene Grosbein <eugen at grosbein.net> wrote about Re: high cpu irq load and slow boot after update from 10.4 to 11.2:

> > Any ideas?

> Maybe this box has some clocking problems incompatible with the tickless
> kernel.

Is there anything I could look out for in dmesg or similar to spot the root cause for this behaviour? The CPUs are Xeon E5606 on a Supermicro X8DTU mainboard.

> Try going back to the old periodic ticking with sysctl
> kern.eventtimer.periodic=1 instead of the now-default 0.

I'll try that as soon as I spot the issue again.

> Or, if you are curious, run ntpd if it is not already running, wait
> about an hour, then look at its /var/db/ntpd.drift file to see if the system
> clock is good or not.

ntpd is always running. Right now it looks ok to me (but the issue is not there, either).

root@storage:~ # cat /var/db/ntpd.drift
-1.366

> Perhaps you can get better behaviour by changing the default value
> of kern.timecounter.hardware to another one from kern.timecounter.choice;
> the same goes for kern.eventtimer.timer and kern.eventtimer.choice.

Would that work while I see the issue (i.e., should it make the issue go away then), or should this be set on (re)boot? Which settings would be recommended to try? This is what I have now:

---
root@storage:~ # sysctl kern.timecounter.hardware
kern.timecounter.hardware: TSC
root@storage:~ # sysctl kern.timecounter.choice
kern.timecounter.choice: ACPI-safe(850) HPET(950) i8254(0) TSC(1000) dummy(-1000000)
root@storage:~ # sysctl kern.eventtimer.timer
kern.eventtimer.timer: LAPIC
root@storage:~ # sysctl kern.eventtimer.choice
kern.eventtimer.choice: LAPIC(600) HPET(350) HPET1(340) HPET2(340) HPET3(340) i8254(100) RTC(0)
---

Thanks for your input.

cu
Gerrit
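
To make the "which settings to try" question concrete, the sysctl output above can be ranked mechanically. The sketch below is only an illustration: the two strings are copied from that output, the helper names are hypothetical, and the rule "skip entries with a quality below 1" is an assumption of this sketch rather than a FreeBSD recommendation. It lists candidates in descending quality and formats the matching sysctl commands without running them.

```python
import re
from typing import Dict, List

# Values copied from the sysctl output in the message above.
TIMECOUNTER_CHOICE = "ACPI-safe(850) HPET(950) i8254(0) TSC(1000) dummy(-1000000)"
EVENTTIMER_CHOICE = (
    "LAPIC(600) HPET(350) HPET1(340) HPET2(340) HPET3(340) i8254(100) RTC(0)"
)


def parse_choices(value: str) -> Dict[str, int]:
    """Parse 'NAME(quality) NAME(quality) ...' into {name: quality}."""
    return {name: int(q) for name, q in re.findall(r"(\S+?)\((-?\d+)\)", value)}


def alternatives(value: str, current: str, min_quality: int = 1) -> List[str]:
    """List names other than `current`, best quality first.

    min_quality is an illustrative cut-off (an assumption of this sketch).
    """
    choices = parse_choices(value)
    ranked = sorted(choices, key=choices.get, reverse=True)
    return [n for n in ranked if n != current and choices[n] >= min_quality]


def sysctl_commands(oid: str, names: List[str]) -> List[str]:
    """Format, but do not execute, one 'sysctl oid=value' command per name."""
    return [f"sysctl {oid}={name}" for name in names]


if __name__ == "__main__":
    for cmd in sysctl_commands(
        "kern.timecounter.hardware", alternatives(TIMECOUNTER_CHOICE, "TSC")
    ):
        print(cmd)
    for cmd in sysctl_commands(
        "kern.eventtimer.timer", alternatives(EVENTTIMER_CHOICE, "LAPIC")
    ):
        print(cmd)
```

With the values from this machine, HPET is the next-ranked candidate after the current setting in both lists. Whether switching at runtime clears the interrupt load is exactly the open question in the message, so the sketch deliberately stops at printing commands.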