Hi there,
I'm still working on debugging this myself, but thought that a
few more experienced eyes might be able to help me.
I'm tracking 7-STABLE on my amd64 system, but something
happened a couple of weeks ago that broke lang/drscheme.
I've been doing a bit of regressing and testing, and have
found that a mred executable built against a 7-STABLE
checkout of 2008.01.04.00.00.00 works fine, but the exact
same binary crashes with a SEGV on a kernel check-out of
2008.01.11.00.00.00. It's a bit hard to debug, because the
code in question is a twisty maze of macros, #ifs and inlines,
because it's in the garbage collector, and that's quite
platform-sensitive.
Anyway, the last part of the ktrace of the broken version (the
earlier parts are just loading up shared libraries) looks like:
(I sedded the ^pid out, so that I could get a better look at it
with diff (meld, actually: it's nice)).
mredid mred RET mmap 5496832/0x80053e000
mredid mred CALL
mmap(0,0x200000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
mredid mred RET mmap 57888768/0x803735000
mredid mred CALL munmap(0x803735000,0xcb000)
mredid mred RET munmap 0
mredid mred CALL munmap(0x803900000,0x35000)
mredid mred RET munmap 0
mredid mred CALL thr_self(0x803801120)
mredid mred RET thr_self 0
mredid mred CALL
mmap(0x7fffffbff000,0x1000,PROT_NONE,MAP_ANON,0xffffffff,0)
mredid mred RET mmap -4198400/0x7fffffbff000
mredid mred CALL thr_set_name(0x1875d,0x802de3800)
mredid mred RET thr_set_name 0
mredid mred CALL rtprio_thread(0,0x1875d,0x7fffffffe110)
mredid mred RET rtprio_thread 0
mredid mred CALL sysarch(0x81,0x7fffffffe130)
mredid mred RET sysarch 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x7fffffffe150,0x7fffffffe140)
mredid mred RET sigprocmask 0
mredid mred CALL sigaction(SIG 32,0x7fffffffe110,0)
mredid mred RET sigaction 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x7fffffffe140,0)
mredid mred RET sigprocmask 0
mredid mred CALL sigprocmask(SIG_BLOCK,0x800633cc0,0x7fffffffe190)
mredid mred RET sigprocmask 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x800633cd0,0)
mredid mred RET sigprocmask 0
mredid mred CALL getrlimit(RLIMIT_DATA,0x7fffffffe3a0)
mredid mred RET getrlimit 0
mredid mred CALL
mmap(0,0x104000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
mredid mred RET mmap 59768832/0x803900000
mredid mred PSIG SIGSEGV SIG_DFL
mredid mred NAMI "mred.core"
The equivalent section of the version from 4 Jan (that works
fine) looks like:
mredid mred RET mmap 5496832/0x80053e000
mredid mred CALL
mmap(0,0x200000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
mredid mred RET mmap 57835520/0x803728000
mredid mred CALL munmap(0x803728000,0xd8000)
mredid mred RET munmap 0
mredid mred CALL munmap(0x803900000,0x28000)
mredid mred RET munmap 0
mredid mred CALL thr_self(0x803801120)
mredid mred RET thr_self 0
mredid mred CALL
mmap(0x7fffffbff000,0x1000,PROT_NONE,MAP_ANON,0xffffffff,0)
mredid mred RET mmap -4198400/0x7fffffbff000
mredid mred CALL thr_set_name(0x187ba,0x802dd6800)
mredid mred RET thr_set_name 0
mredid mred CALL rtprio_thread(0,0x187ba,0x7fffffffe0f0)
mredid mred RET rtprio_thread 0
mredid mred CALL sysarch(0x81,0x7fffffffe110)
mredid mred RET sysarch 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x7fffffffe130,0x7fffffffe120)
mredid mred RET sigprocmask 0
mredid mred CALL sigaction(SIG 32,0x7fffffffe0f0,0)
mredid mred RET sigaction 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x7fffffffe120,0)
mredid mred RET sigprocmask 0
mredid mred CALL sigprocmask(SIG_BLOCK,0x800633cc0,0x7fffffffe170)
mredid mred RET sigprocmask 0
mredid mred CALL sigprocmask(SIG_SETMASK,0x800633cd0,0)
mredid mred RET sigprocmask 0
mredid mred CALL getrlimit(RLIMIT_DATA,0x7fffffffe380)
mredid mred RET getrlimit 0
mredid mred CALL
mmap(0,0x104000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
mredid mred RET mmap 59768832/0x803900000
mredid mred CALL
mmap(0,0x300000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
mredid mred RET mmap 60833792/0x803a04000
mredid mred CALL munmap(0x803a04000,0xfc000)
mredid mred RET munmap 0
mredid mred CALL munmap(0x803d00000,0x4000)
mredid mred RET munmap 0
mredid mred CALL sigaction(SIGSEGV,0x7fffffffe360,0x7fffffffe340)
mredid mred RET sigaction 0
mredid mred CALL
__sysctl(0x7fffffffd8f0,0x2,0x7fffffffd950,0x7fffffffd8e8,0,0)
mredid mred RET __sysctl 0
mredid mred CALL socket(PF_LOCAL,SOCK_STREAM,0)
mredid mred RET socket 3
mredid mred CALL
__sysctl(0x7fffffffd910,0x2,0x7fffffffd970,0x7fffffffd908,0,0)
mredid mred RET __sysctl 0
mredid mred CALL
__sysctl(0x7fffffffd8b0,0x2,0x7fffffffd8f0,0x7fffffffd8a8,0,0)
mredid mred RET __sysctl 0
mredid mred CALL connect(0x3,0x7fffffffd9f0,0x13)
mredid mred NAMI "/tmp/.X11-unix/X0"
mredid mred RET connect 0
[...and so on...]
So it looks as though it's in a section just before it
establishes it's SIGSEGV handler, and so presumably isn't
expecting to get segv'd just yet. If it hadn't been segv'd,
the next thing to happen was that
mredid mred CALL
mmap(0,0x300000,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,0xffffffff,0)
the faulting instruction is:
0x0000000800bdecc6 <GC_init_type_tags+598>: mov %r13,(%rdx,%rax,8)
r13 is 0x803900000
rdx is -1
rax is 0xe40
Any thoughts on why that would be faulting? (Looks like a write
to a very low address (code?) to me, but I'm not up on the VM
map yet.)
The only thing that looks appropriate that changed in that week
was sys/vm/vm_map.c, which had some new code added to help with
shm mappings. I looked at it, but it didn't look as though it
would affect this.
If anyone has suggestions about what I could
investigate/change/poke next, I'd appreciate it.
Cheers,
--
Andrew