Hi,

I am a kgdb newbie, so please be patient.
I suspect that this is an ntfs-3g related problem. My only basis is that this is the 4th time I have been editing text files on my NTFS partition through ntfs-3g, using Emacs, and have gotten frequent I/O error messages inside Emacs, followed by a kernel panic.
If you ask me exactly how to reproduce it: I'm sorry, I can't tell you exactly (but see the kgdb output below).
Anyway, the kernel seems to panic at /usr/src/sys/kern/vfs_bio.c:1530

Just a suggestion for a patch (without knowing the functionality of /usr/src/sys/kern/vfs_bio.c):

The line where the kernel panics:
/usr/src/sys/kern/vfs_bio.c:
----------------------------------
    VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
    ...
----------------------------------

Comparing to another file, which does error checking before calling VM_OBJECT_LOCK:
/usr/src/sys/kern/vfs_aio.c:
----------------------------------
    if (vp->v_object != NULL) {
        VM_OBJECT_LOCK(vp->v_object);
        ...
----------------------------------

Perhaps the kernel panic could be avoided with the following patch?
/usr/src/sys/kern/vfs_bio.c (suggested patch):
----------------------------------
    if ((bp->b_bufobj != NULL) && (bp->b_bufobj->bo_object != NULL)) {
        VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
        ...
----------------------------------

Please let me know if you need more information.

Regards,
Johan Kuuse

-----------------------------------------------------------------------------------------------------------
kgdb kernel.debug /var/crash/vmcore.1
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x34
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc07b6de4
stack pointer           = 0x28:0xe79de7c8
frame pointer           = 0x28:0xe79de7e8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1214 (opera)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 5h20m30s
Physical memory: 2035 MB
Dumping 218 MB: 203 187 171 155 139 123 107 91 75 59 43 27 11

#0  doadump () at pcpu.h:195
195         __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) list *0xc07b6de4
0xc07b6de4 is in vfs_vmio_release (/usr/src/sys/kern/vfs_bio.c:1530).
1525    vfs_vmio_release(struct buf *bp)
1526    {
1527            int i;
1528            vm_page_t m;
1529
1530            VM_OBJECT_LOCK(bp->b_bufobj->bo_object);
1531            vm_page_lock_queues();
1532            for (i = 0; i < bp->b_npages; i++) {
1533                    m = bp->b_pages[i];
1534                    bp->b_pages[i] = NULL;
(kgdb) bt
#0  doadump () at pcpu.h:195
#1  0xc0754457 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0xc0754719 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:563
#3  0xc0a4905c in trap_fatal (frame=0xe79de788, eva=52) at /usr/src/sys/i386/i386/trap.c:899
#4  0xc0a492e0 in trap_pfault (frame=0xe79de788, usermode=0, eva=52) at /usr/src/sys/i386/i386/trap.c:812
#5  0xc0a49c8c in trap (frame=0xe79de788) at /usr/src/sys/i386/i386/trap.c:490
#6  0xc0a2fc0b in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc07b6de4 in vfs_vmio_release (bp=0xd927e33c) at /usr/src/sys/kern/vfs_bio.c:1530
#8  0xc07b8a81 in getnewbuf (slpflag=0, slptimeo=0, size=Variable "size" is not available.
) at /usr/src/sys/kern/vfs_bio.c:1847
#9  0xc07ba118 in getblk (vp=0xc8891bb0, blkno=0, size=2048, slpflag=0, slptimeo=0, flags=Variable "flags" is not available.
) at /usr/src/sys/kern/vfs_bio.c:2602
#10 0xc0932815 in ffs_balloc_ufs2 (vp=0xc8891bb0, startoffset=Variable "startoffset" is not available.
) at /usr/src/sys/ufs/ffs/ffs_balloc.c:699
#11 0xc0952a85 in ffs_write (ap=0xe79debc4) at /usr/src/sys/ufs/ffs/ffs_vnops.c:720
#12 0xc0a5efc6 in VOP_WRITE_APV (vop=0xc0b93c60, a=0xe79debc4) at vnode_if.c:691
#13 0xc07dbf37 in vn_write (fp=0xc85f3168, uio=0xe79dec60, active_cred=0xc61c6300, flags=0, td=0xc583fc60) at vnode_if.h:373
#14 0xc07875e7 in dofilewrite (td=0xc583fc60, fd=17, fp=0xc85f3168, auio=0xe79dec60, offset=-1, flags=0) at file.h:254
#15 0xc07878c8 in kern_writev (td=0xc583fc60, fd=17, auio=0xe79dec60) at /usr/src/sys/kern/sys_generic.c:401
#16 0xc078793f in write (td=0xc583fc60, uap=0xe79decfc) at /usr/src/sys/kern/sys_generic.c:317
#17 0xc0a49635 in syscall (frame=0xe79ded38) at /usr/src/sys/i386/i386/trap.c:1035
#18 0xc0a2fc70 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:196
#19 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)
-----------------------------------------------------------------------------------------------------------

On Sunday 10 August 2008 10:01:49 pm Johan Kuuse wrote:
> Anyway, the kernel seems to panic at /usr/src/sys/kern/vfs_bio.c:1530
> [...]
> #7  0xc07b6de4 in vfs_vmio_release (bp=0xd927e33c) at /usr/src/sys/kern/vfs_bio.c:1530
> #8  0xc07b8a81 in getnewbuf (slpflag=0, slptimeo=0, size=Variable "size" is not available.
> ) at /usr/src/sys/kern/vfs_bio.c:1847
> #9  0xc07ba118 in getblk (vp=0xc8891bb0, blkno=0, size=2048, slpflag=0, slptimeo=0, flags=Variable "flags" is not available.
> ) at /usr/src/sys/kern/vfs_bio.c:2602
> #10 0xc0932815 in ffs_balloc_ufs2 (vp=0xc8891bb0, startoffset=Variable "startoffset" is not available.
> ) at /usr/src/sys/ufs/ffs/ffs_balloc.c:699
> #11 0xc0952a85 in ffs_write (ap=0xe79debc4) at /usr/src/sys/ufs/ffs/ffs_vnops.c:720
> [...]

FYI, you got the panic in ffs/ufs, not fuse. I've seen this at work on 6.x with NFS, with no clue as to what causes it. You can start by going to frame 7 and doing 'p *bp'.

-- 
John Baldwin
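
A side note on the trap output quoted above: the fault virtual address 0x34 (eva=52 in the backtrace) is far too small to be a valid kernel address. That is the pattern one would expect if some pointer in the expression bp->b_bufobj->bo_object were NULL and the CPU read a member at a fixed offset from it. The sketch below only illustrates that arithmetic. The struct layout is made up so that the member lands at 0x34; it is not the real struct bufobj or vm_object layout, and member_address() and fault_address_for_null_base() are hypothetical helpers. Printing *bp in frame 7 remains the way to check the hypothesis against the actual dump.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real kernel structure: the padding is
 * chosen only so that the member of interest sits at offset 0x34.
 */
struct fake_bufobj {
    unsigned char other_fields[0x34];
    uint32_t      bo_object_slot;   /* stands in for a 32-bit pointer on i386 */
};

/* Compile-time check that the illustrative layout is what we claim. */
static_assert(offsetof(struct fake_bufobj, bo_object_slot) == 0x34,
    "illustrative layout only");

/* Address the CPU touches when it reads a member: base + member offset. */
static uintptr_t
member_address(uintptr_t base, size_t offset)
{
    return base + offset;
}

/* With a NULL base, the faulting address is simply the member offset. */
static uintptr_t
fault_address_for_null_base(void)
{
    return member_address(0, offsetof(struct fake_bufobj, bo_object_slot));
}
```

With this made-up layout, fault_address_for_null_base() evaluates to 0x34, the same shape as the address reported by the trap. Which pointer was actually NULL cannot be read off the address alone.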

> On Tuesday 12 August 2008 02:42:52 am Johan Kuuse wrote:
>> On Monday 11 August 2008 23:04:30 John Baldwin wrote:
>> > FYI, you got the panic in ffs/ufs, not fuse. I've seen this at work on
>> > 6.x with NFS with no clues on what causes it. You can start by going to
>> > frame 7 and doing 'p *bp'.
>>
>> Thanks for the hints.
>> See below for more debug output.
>> I recognize that the bp struct members b_data and b_kvabase both point to a
>> chunk of memory containing the text of the Opera web page I was reading
>> when the kernel crashed. (This is indicated above: current process
>> = 1214 (opera))
>>
>> But what is most interesting is that b_bufobj = 0x0
>> Obviously, then trying to access bp->b_bufobj->bo_object will cause a
>> crash. So I think it would be a good idea to NULL-check the struct member
>> before trying to access it. How should I proceed? Should I post this as a
>> possible bug somewhere else, to another list?
>
> Unfortunately, it is a worse problem that b_bufobj is NULL. That means there
> is a bug elsewhere. I'll look at this some more.
>
> Hmm, can you reproduce this at all? If so, can you try the patch below.
> Hopefully it panics here which might help:
>
> Index: vfs_subr.c
> ===================================================================
> --- vfs_subr.c (revision 181629)
> +++ vfs_subr.c (working copy)
> @@ -1546,6 +1546,9 @@
>      CTR3(KTR_BUF, "brelvp(%p) vp %p flags %X", bp, bp->b_vp, bp->b_flags);
>      KASSERT(bp->b_vp != NULL, ("brelvp: NULL"));
>
> +    if (bp->b_flags & B_VMIO)
> +        panic("brelvp of B_VMIO buffer");
> +
>      /*
>       * Delete from old vnode list, if on one.
>       */
>
> --
> John Baldwin

Sorry, at the moment I don't know how to reproduce the crash.
I mentioned ntfs-3g/fuse because I got the impression that they caused a heavy load on my box, but in the end it was Opera that caused the crash (although it was causing a heavy load as well).
What I can do is apply your patch and play around with CPU-consuming apps to see whether I can reproduce the crash under heavy load.
Currently I'm running 7.0-RELEASE. Would you recommend upgrading to STABLE before applying the patch?

Regards,
Johan Kuuse
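
The difference between the two approaches discussed in this thread (NULL-checking at the crash site versus panicking where the buffer is detached) can be sketched in user space. This is only a model under stated assumptions: struct model_buf, MODEL_B_VMIO and model_brelvp() are hypothetical stand-ins, not kernel code, and the model assumes that detaching a buffer from its vnode is what clears the b_bufobj pointer. The real patch would panic(); the model returns an error code so it can be exercised outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_B_VMIO 0x1u   /* hypothetical flag value, stand-in for B_VMIO */

/* Hypothetical, heavily reduced stand-in for struct buf. */
struct model_buf {
    unsigned int flags;   /* stand-in for b_flags */
    void        *bufobj;  /* stand-in for b_bufobj */
};

enum model_result {
    MODEL_OK = 0,
    MODEL_INVARIANT_BROKEN = 1
};

/*
 * Models the diagnostic check: a buffer that is still marked VMIO must
 * not lose its buffer object, so the violation is reported at the point
 * of detachment instead of later, when the stale buffer is released.
 */
static enum model_result
model_brelvp(struct model_buf *bp)
{
    assert(bp != NULL);

    if (bp->flags & MODEL_B_VMIO)
        return MODEL_INVARIANT_BROKEN;  /* the kernel patch would panic() here */

    bp->bufobj = NULL;  /* assumed effect of detaching from the vnode */
    return MODEL_OK;
}
```

In this model a buffer with the VMIO flag set is rejected and keeps its pointer, while a buffer without the flag is detached normally. A NULL check at the crash site would hide the symptom, whereas a check like this one points at the code path that broke the invariant.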