Daniel Dvořák
2006-Sep-01 23:02 UTC
watchdogd_flags followed by panic watchdog timeout, after reboot my rc.conf disappear
Hi all,

first of all, I'm sorry for my bad English.

We have 2 routers which I maintain in our wireless mesh community network. Router 1 (R1) has 2 Atheros adapters (ath0 = Wistron CM9, ath1 = Wistron CM10), plus some sisX, fxpX and so on. Router 2 (R2) has 1 Atheros adapter (ath0 = Wistron CM10).

R1 panics and, even more often, freezes. Maybe the panics and the freezes have the same cause, maybe not. That is not important now; this story is about R2.

I recently started to use "options SW_WATCHDOG" in both my custom kernels on R1 and R2, hoping it would be a workaround at least for the freezing, if not for the panics.

There is no "watchdogd_flags" option in /etc/defaults/rc.conf, but I tried to write it to my /etc/rc.conf this way:

watchdogd_enable="YES"
watchdogd_flags="-e ping 10.40.0.72 -s 2 -t 1"

I saved my rc.conf without any doubt.

I did so because I wanted to instruct watchdogd to execute my own command, a plain ping of some IP address. I was not satisfied with the trivial file system check it does otherwise.

After saving the rc.conf file, I restarted the watchdogd daemon at once.

... and ... 2 seconds ... my ssh client was disconnected ... unexpected end of ssh session. :)

Okay, maybe something was wrong, maybe I made a mistake and it panicked. I waited for 3 minutes, but R2 did not react at all. So I went to R2 and powered it off and on ... but it was still the same.

After I attached a monitor and keyboard, I saw that ifconfig had not configured any interfaces. Why?

Answer: because rc.conf had 0 bytes!!!

-rw-r--r-- 1 root wheel 6174 Sep 1 XX:XX rc.conf

(I do not remember the time of the last modification of the file.)

So the content of rc.conf was completely gone!!! Is that possible at all? Now I am scared that any modification of rc.conf will mean loss of its content.

I have a kernel dump and a backtrace of the panic; it is in the attachment. If I can help with this, I will.

And please, can somebody explain to me how I lost the content of the rc.conf file? :-O

Thank you.
Daniel

P.S.: I am not currently subscribed to the freebsd-stable mailing list, so please use my e-mail address. I am OK with the freebsd-current mailing list.

-------------- next part --------------
# cd /usr/obj/usr/src/sys/mykernel/
# kgdb kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
interrupt                   total
irq14: ata0                325735
irq16: fxp1                     5
irq17: ath0              50298459
irq18: wi0                3904083
irq19: sis0 fxp0         20167051
cpu0: timer             604044908
Total                   678740241
panic: watchdog timeout
Uptime: 3d11h53m45s
Dumping 223 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 223MB (57072 pages) 207 191 175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165             __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) backtrace
#0  doadump () at pcpu.h:165
#1  0xc059c4ee in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc059c7a6 in panic (fmt=0xc081050d "watchdog timeout") at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc0571642 in watchdog_fire () at /usr/src/sys/kern/kern_clock.c:583
#4  0xc0571130 in hardclock (frame=0xc1f44780) at /usr/src/sys/kern/kern_clock.c:279
#5  0xc07a4631 in lapic_handle_timer (frame={cf_vec = 0, cf_fs = 8, cf_es = 40, cf_ds = 40, cf_edi = -1040320488, cf_esi = -1040320512, cf_ebp = -890192676, cf_ebx = 0, cf_edx = 0, cf_ecx = -1041016416, cf_eax = 1000, cf_eip = -1063283195, cf_cs = 32, cf_eflags = 524818, cf_esp = -890192644, cf_ss = -1063305969}) at /usr/src/sys/i386/i386/local_apic.c:623
#6  0xc079eb30 in Xtimerint () at apic_vector.s:137
#7  0xc09f9605 in ?? ()
#8  0xcaf0bd04 in ?? ()
#9  0xc07a609f in cpu_idle () at /usr/src/sys/i386/i386/machdep.c:1134
Previous frame inner to this frame (corrupt stack?)
(kgdb) quit
Stefan Bethke
2006-Sep-04 09:16 UTC
watchdogd_flags followed by panic watchdog timeout, after reboot my rc.conf disappear
[ Please do not crosspost. ]

On 02.09.2006 at 01:01, Daniel Dvořák wrote:

> There is no "watchdogd_flags" option in /etc/defaults/rc.conf, but
> I tried to write it to my /etc/rc.conf this way:
>
> watchdogd_enable="YES"
> watchdogd_flags="-e ping 10.40.0.72 -s 2 -t 1"

You probably would have wanted "-e 'ping 10.40.0.72 -s2 -t1'". Without the single quotes, the command is just ping, which will exit with 64 (EX_USAGE), so the command never completes successfully, and the kernel watchdog timer is never reset. Hence the watchdog timeout. It's a bug in watchdogd that it does not complain about the extra arguments.

> I saved my rc.conf without any doubt.
>
> I did so because I wanted to instruct watchdogd to execute my own
> command, a plain ping of some IP address. I was not satisfied with
> the trivial file system check it does otherwise.
>
> After saving the rc.conf file, I restarted the watchdogd daemon at once.
>
> ... and ... 2 seconds ... my ssh client was disconnected ...
> unexpected end of ssh session. :)

Most likely, the rc.conf changes had not been committed to disk when the watchdog timeout occurred, so they got lost.

The watchdog facility is meant to recover the machine from serious problems (like deadlocks, livelocks, or similar). As such, it will not do a proper shutdown, since the machine is probably in a state where the shutdown would also hang. It's a last-ditch effort to get the machine to be responsive again, even if there might be damage due to the sudden panic/reboot.

If you want to reboot your router when network connectivity is problematic, I'd instead set up a cron job that runs ping and invokes shutdown -r if it fails.

Stefan

--
Stefan Bethke <stb@lassitu.de> Fon +49 170 346 0140
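The cron-based approach Stefan describes could be sketched roughly as below. This is a minimal illustration, not a tested production script: the function name check_link and the PING_CMD / REBOOT_CMD variables are hypothetical names introduced here, and the "-c 3" ping count is an assumed value. The commands are kept in variables so the script can be dry-run without actually rebooting anything.

```sh
#!/bin/sh
# Sketch of a cron-driven connectivity check (all names are hypothetical).

# Overridable commands, so the script can be dry-run safely, e.g.
#   REBOOT_CMD="echo would reboot" sh linkcheck.sh 10.40.0.72
PING_CMD=${PING_CMD:-"ping -c 3"}
REBOOT_CMD=${REBOOT_CMD:-"shutdown -r now"}

# check_link HOST
# Returns 0 if $PING_CMD succeeds for HOST. Otherwise runs $REBOOT_CMD,
# which by default requests an orderly reboot through shutdown(8)
# instead of a watchdog panic.
check_link() {
    host=$1
    if $PING_CMD "$host" >/dev/null 2>&1; then
        return 0
    fi
    $REBOOT_CMD
}

# Act only when a host is given on the command line.
if [ $# -ge 1 ]; then
    check_link "$1"
fi
```

Assuming the script were saved as /usr/local/sbin/linkcheck.sh (a made-up path), an /etc/crontab entry such as "*/5 * * * * root /usr/local/sbin/linkcheck.sh 10.40.0.72" would run it every five minutes. Because shutdown performs a normal shutdown sequence, pending writes such as a freshly edited rc.conf get a chance to reach the disk. Note that rebooting on a single failed check may be too aggressive if the pinged host itself is down.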
Dmitry Pryanishnikov
2006-Sep-22 04:15 UTC
watchdogd_flags followed by panic watchdog timeout, after reboot my rc.conf disappear
Hello!

On Sat, 2 Sep 2006, Daniel Dvořák wrote:

> I saved my rc.conf without any doubt.

I believe you, really ;)

> Answer: because rc.conf had 0 bytes!!!
>
> -rw-r--r-- 1 root wheel 6174 Sep 1 XX:XX rc.conf , I do not remember
> the time of the last modification of the file.
>
> So the content of rc.conf was completely gone!!!

Yes, because by default "/" is mounted in the following fashion:

noasync  Metadata I/O should be done synchronously, while data I/O
         should be done asynchronously. This is the default.
         -------------------------------^^^^^^^^^^^^^^^^^^^^

So yes, /etc/rc.conf will become empty if you have just edited it and then, e.g., power disappears. It's a dangerous situation, because the box becomes unreachable via the network. To guard against it, you can just mount "/" in synchronous mode:

sync     All I/O to the file system should be done synchronously.

I've just modified my test machine's configuration in this way:

/dev/ad0s3a  /  ufs  rw,sync  1  1

and repeated the "edit /etc/rc.conf" -> "power off/on" sequence several times (there is no RESET key on the box). The rc.conf is intact (while without "sync" it became empty after my second attempt). Note that this will further decrease FS performance for "/" (I also always follow the good old RELENG_4 advice NOT to turn softupdates on for "/"). That's why /tmp and /var are separate partitions (or just symlinks to SU-enabled /usr) in my typical setup.

> And please, can somebody explain to me how I lost the content of the rc.conf file? :-O

I hope I've just managed to do that ;)

> P.S.: I am not currently subscribed to the freebsd-stable mailing list, so
> please use my e-mail address. I am OK with the freebsd-current mailing list.

I think my recipe would be more useful on the -stable list (which IMHO is a must-read for admins of production machines), that's why I'm sending it to -stable as well.

Sincerely, Dmitry

--
Atlantis ISP, System Administrator
e-mail: dmitry@atlantis.dp.ua
nic-hdl: LYNX-RIPE
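As a complement to mounting "/" with sync (and not something proposed in the thread itself), one could also make each edit of a critical file more defensive: keep the previous version, refuse to install an empty file, and ask the kernel to flush buffers right away. The helper below is a hypothetical sketch under those assumptions; safe_install is an invented name, and sync(8) only requests a flush, so it narrows the window for data loss rather than closing it.

```sh
#!/bin/sh
# Hypothetical helper, not part of FreeBSD: install an edited copy of a
# config file while keeping the old version and flushing data to disk.

# safe_install NEW TARGET
#   NEW     path to the edited copy
#   TARGET  file to replace, e.g. /etc/rc.conf
safe_install() {
    new=$1
    target=$2

    # Refuse to install an empty file; an empty rc.conf is exactly
    # the failure described in this thread.
    [ -s "$new" ] || { echo "refusing empty file: $new" >&2; return 1; }

    # Keep the previous version so it can be restored from the console.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.bak" || return 1
    fi

    # Write the new data next to the target and request a flush before
    # renaming; mv within one file system is a rename, so the target is
    # either the old or the new file, never a half-written one.
    cp "$new" "$target.tmp" || return 1
    sync
    mv "$target.tmp" "$target" || return 1
    sync
}
```

A possible use would be to edit a copy such as /root/rc.conf.new (an example path) and then run safe_install /root/rc.conf.new /etc/rc.conf. If the box still comes up with a damaged rc.conf, the .bak copy is available from the console.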