barbara.xxx1975@libero.it
2008-Nov-07 00:11 UTC
R: 6.4-RC2 crashes after a few minutes of uptime
> Ken,
>
> First of all, I'm not subscribed to the freebsd-stable@ mailing list,
> so please include me in any replies. I have been using a Dell Pentium
> II Inspiron 3700 (192 MiB of RAM, 430 MHz PII-Celeron CPU) to test
> FreeBSD 6.4 and have been noticing panics since upgrading it to 6.4-RC1.
>
> The system initially was installed with 6.3-RELEASE (single partition
> installation, ACPI enabled, GNOME 2) and had been quite stable. So, in
> an effort to help test 6.4, I upgraded it to 6.4-RC1 using
> freebsd-update(8). The upgrade has destabilized the system, as every
> single bootup since then has led to a panic. I had coredumps turned on,
> so I tried to gather a backtrace with kgdb(1) and every time I'd try it
> would hang trying to parse the core dump. All I see is 4 lines of
> "Attempt to extract a component of a value that is not a structure
> pointer." and it hangs.
>
> So, today I decided to upgrade to 6.4-RC2 in hopes that it would
> resolve these problems and it also crashes. Though it does seem to
> last longer before panicking. With RC1 I couldn't even get a full login
> to GNOME to finish before the panic, most times. Now, with RC2 it
> seems to crash later, but this could just be coincidence. So far I've
> just had one crash with RC2, and am now running a few apps in GNOME to
> see if I can trigger it again.
>
> Note also, with RC2 I'm unable to analyze the kernel core dump either.
> kgdb(1) seems to hang with the same error as with RC1. I do have
> minidump on with RC2, and I've just now disabled it so that next time
> it crashes, it will save a full core file, to see if that makes a
> difference. Though, if RC1 is any indication, I doubt that will make
> any difference, but we'll see. All I can glean from the panic (from
> the info.0 file) is that the Panic String is a "page fault." I believe
> this was the same error with RC1 and as I recall every panic I was
> able to see (at the console) it involved the process named "swi6."
> Guess I should try booting with ACPI disabled to see if there is any
> difference, though I never had to do this with 6.3.
>
> Well, if I can assist with further debugging, let me know.
>
> Thanks,
>
> - rory

Hello,
I had a similar problem, described in this thread:
http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html

To summarize, I had several swi6 panics (but they occurred after several hours of uptime). I'm tracking STABLE, generally resyncing /usr/src every 1-2 weeks, and I started having problems around Oct. 5. Unfortunately I got no answer.

The problem *seems* to have gone away after I took the following actions:
- disabling powerd (I had enabled it a few days before the problem emerged)
- getting new sources and doing a new buildworld
- rebuilding some GNOME ports (the "non usual" ones I was using when the panics occurred) and all the GNOME daemons (sysutils/hal etc.), since the person having a similar problem on 7 "solved" it by moving away from GNOME

I'm not sure which of these was the decisive action (and even though I have had no more panics, I'm not really sure that the problem can be considered solved!)

I hope this helps with your stability problem...

Regards,
Barbara
>
> The person who followed up with a list of things that *may* have made
> the problem go away mentioned one of the things was disabling powerd.
> Do you have that enabled, and if yes would you mind disabling it to see
> if that's the culprit?
>
> Thanks for the report.

Hi, it's the person speaking ;)
It seems that I spoke too early. About an hour ago my box hung, but this time it didn't panic (it hasn't panicked since ~Oct. 12). And as confirmed by Rory, it seems that powerd isn't responsible.

The only thing I was able to do was switch to ttyv0, but after I entered my login it didn't prompt for the password. Meanwhile, messages similar to the following were popping up:

acd0: WARNING - PREVENT_ALLOW taskqueue timeout - completing request directly
acd0: WARNING - PREVENT_ALLOW freeing taskqueue zombie request
acd0: WARNING - TEST_UNIT_READY taskqueue timeout - completing request directly
acd0: WARNING - TEST_UNIT_READY freeing taskqueue zombie request

Again, as I reported in http://www.freebsd.org/cgi/query-pr.cgi?pr=128076 , I was not using acd0, and I had not used it at any point since the box was turned on.

The box replied to pings, but I was unable to access it via ssh, so I had to press the reset button.

What happened is similar to what is described here:
http://lists.freebsd.org/pipermail/freebsd-ports/2006-December/037796.html
And in
http://www.freebsd.org/cgi/query-pr.cgi?pr=110015
I can see another swi6 panic with the same message in the kernel buffer (acd0: WARNING - PREVENT_ALLOW read data overrun 18>0) that I had in my PR.

Isn't my backtrace of any help in tracking down the problem?
>> About kgdb...
>> I never used freebsd-update, so sorry if I'm saying something
>> stupid, but could it be the case that the kernel has been built
>> without debugging symbols or something like that? Does
>> freebsd-update provide a kernel.debug?
>
> I haven't had to use the kernel.debug file in the obj dir in a long
> time. As far as I know, these days, the GENERIC kernel includes debug
> symbols. And in cases when there aren't any debug symbols, that
> shouldn't prevent kgdb from loading, I wouldn't think.

Hello,
I had a kernel panic some hours ago, but I think it's related to a problem with one of my HDs.
I've got a dump in /var/crash, and as you were interested, I ran:

# kgdb /boot/kernel/kernel /var/crash/vmcore.6
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...(no debugging symbols found)...
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
Attempt to extract a component of a value that is not a structure pointer.
Terminated

I had to pkill kgdb as it was stuck in a loop.
Running it against kernel.debug in /usr/obj/usr/src/sys/$KERNCONF/ worked as expected.
I've always done it this way, so I don't know whether it worked with earlier releases.

B
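
For anyone who wants to repeat the comparison above, here is a minimal sketch of how the choice between the two kernel images might be scripted. The function name pick_kernel and its optional second argument are hypothetical, made up for illustration; only the paths /boot/kernel/kernel and /usr/obj/usr/src/sys/$KERNCONF/kernel.debug come from this thread. It prefers kernel.debug when that file exists and otherwise falls back to the installed kernel.

```shell
#!/bin/sh
# Hypothetical helper (not part of FreeBSD): print the kernel image to pass
# to kgdb(1). Prefers kernel.debug from the build tree, which is the file
# that worked in the report above; falls back to the installed kernel.
pick_kernel() {
    kernconf="$1"                          # kernel config name, e.g. GENERIC
    objdir="${2:-/usr/obj/usr/src/sys}"    # build tree; override is an assumption
    debug_kernel="$objdir/$kernconf/kernel.debug"
    if [ -f "$debug_kernel" ]; then
        echo "$debug_kernel"
    else
        echo "/boot/kernel/kernel"
    fi
}

# Example invocation (not executed here):
#   kgdb "$(pick_kernel "$KERNCONF")" /var/crash/vmcore.6
```

If the fallback path is the one printed, kgdb may still loop as shown above, so it is worth checking which file was selected before starting the debugger.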