Ken,

First of all, I'm not subscribed to the freebsd-stable@ mailing list, so please include me in any replies.

I have been using a Dell Pentium II Inspiron 3700 (192 MiB of RAM, 430 MHz PII-Celeron CPU) to test FreeBSD 6.4 and have been noticing panics since upgrading it to 6.4-RC1. The system was initially installed with 6.3-RELEASE (single partition installation, ACPI enabled, GNOME 2) and had been quite stable. So, in an effort to help test 6.4, I upgraded it to 6.4-RC1 using freebsd-update(8). The upgrade has destabilized the system, as every single bootup since then has led to a panic.

I had coredumps turned on, so I tried to gather a backtrace with kgdb(1), and every time I tried, it hung while parsing the core dump. All I see is 4 lines of "Attempt to extract a component of a value that is not a structure pointer." and then it hangs.

So, today I decided to upgrade to 6.4-RC2 in hopes that it would resolve these problems, and it also crashes, though it does seem to last longer before panicking. With RC1, most times I couldn't even get a full login to GNOME to finish before the panic. Now, with RC2, it seems to crash later, but this could just be coincidence. So far I've had just one crash with RC2, and I am now running a few apps in GNOME to see if I can trigger it again.

Note also that with RC2 I'm unable to analyze the kernel core dump either. kgdb(1) seems to hang with the same error as with RC1. I did have minidump on with RC2, and I've just now disabled it so that the next time it crashes it will save a full core file, to see if that makes a difference. If RC1 is any indication, I doubt it will, but we'll see.

All I can glean from the panic (from the info.0 file) is that the Panic String is "page fault". I believe this was the same error as with RC1 and, as I recall, every panic I was able to see at the console involved the process named "swi6".

Guess I should try booting with ACPI disabled to see if there is any difference, though I never had to do this with 6.3.

Well, if I can assist with further debugging, let me know.

Thanks,
- rory
On Fri, 2008-11-07 at 00:00 -0500, Rory Arms wrote:
> Well, if I can assist with further debugging, let me know.

The person who followed up with a list of things that *may* have made the problem go away mentioned that one of those things was disabling powerd. Do you have that enabled, and if yes, would you mind disabling it to see if that's the culprit?

Thanks for the report.

-- 
Ken Smith
- From there to here, from here to      |       kensmith@cse.buffalo.edu
  there, funny things are everywhere.   |
                      - Theodore Geisel |
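For anyone following the thread, the test Ken asks for usually comes down to a single rc.conf(5) setting. The fragment below is only a sketch: it assumes powerd was enabled through /etc/rc.conf in the first place, which the thread does not confirm.

```sh
# /etc/rc.conf fragment (assumed location; the thread does not show Rory's file).
# Keep powerd(8) from being started at boot while testing for the panic.
powerd_enable="NO"
```

The setting only affects the next boot. If powerd is already running, it can be stopped with `/etc/rc.d/powerd stop` before the setting is changed, or the machine can simply be rebooted.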
> Ok, guess something is amiss with the CD-ROM drive on this notebook, as
> in GNOME it flashes an icon of a CD on the desktop from time to time,
> as if it has detected a disc in the drive. But of course there is no
> disc in the drive. I believe it did the same with 6.3 though, but as
> said before, it didn't ever panic due to this issue.
>
> So, some anecdotal info, after running RC2 for a few days now. It
> seems the pattern is that it always panics a few minutes after a first
> cold boot, but then remains stable after the second boot. Odd, as with
> 6.3 this didn't happen. So, I happened to catch a panic while working
> in the syscons console after one of these cold boots. As far as I can
> tell, the panic does have something to do with the CD-ROM drive, as
> right after I saw this message on the console, it immediately panicked:
>
> acd0: WARNING - PREVENT_ALLOW read data overrun 18>0
>
> and then the panic is as follows:
>
> kernel trap 12 with interrupts disabled
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address   = 0x78
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc06d39b9
> stack pointer           = 0x28:0xca865c10
> frame pointer           = 0x28:0xca865c14
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 19 (swi6: task queue)
> trap number             = 12
> panic: page fault
> Uptime: 1h9m7s
> Physical memory: 179MB
> Dumping 43MB: 28 12
> Dump complete

Hi Rory,

did you see my replies or are you missing them for any reason?

Your panics, and some aspects of how they happen, look like mine to me. Look here:
http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html

Unfortunately I got no answer about that, and I've had no comment on the PR I filed:
http://www.freebsd.org/cgi/query-pr.cgi?pr=128076

I wonder if someone has had the time to look at it.
> > Hi Rory,
> >
> > did you see my replies or are you missing them for any reason?
>
> Yes, I have seen your replies. I must have missed the PR you mentioned
> last time, sorry.

No problem!

> > Your panics, and some aspects of how they happen, look like mine to
> > me. Look here:
> > http://lists.freebsd.org/pipermail/freebsd-stable/2008-October/045865.html
>
> Yes, indeed. That looks very similar to the issue I'm running into
> with 6.4-RC2 as well. Sounds like it might be a regression in ata(4).
> At least you were able to open the core dump. Are you still able to
> open core dumps with RC2?

I'm not sure. I'm running STABLE and I have had no panics since the branch changed to RC2. It seems that my panics are not as frequent as yours. Anyway, my box froze a couple of times after the last newvers.sh change, and the symptoms looked the same, with messages about acd0. I was able to ping it, but it wouldn't let me ssh in, as if it were using all the CPU.

About kgdb... I have never used freebsd-update, so sorry if I'm saying something stupid, but could it be that the kernel was built without debugging symbols or something like that? Does freebsd-update provide a kernel.debug?

I've seen that you are not using a shiny quad-core, but could you try building a kernel yourself? I think you could do it on a different, more powerful FreeBSD box if you have one, or even in qemu. I could help if you wish.