Mikolaj Golub
2009-Sep-25 08:20 UTC
May running megarc still cause memory corruption on 7.X?
Hi,

Previously the sysutils/megarc port was marked as broken with the statement:
running megarc may cause memory corruption/system instability.

http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/128082

But recently it has been re-enabled:

http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/137938

Gerrit Beine (the maintainer) said that he verified it on 7.2 and it worked.

But yesterday we had a panic on 7.1-RELEASE-p5 that looks like it was caused
by megarc, with a backtrace identical to the one reported in ports/128082.

Unread portion of the kernel message buffer:
TPTE at 0xbfd20830 IS ZERO @ VA 4820c000
panic: bad pte
cpuid = 0
Uptime: 10h19m56s
Physical memory: 3059 MB
Dumping 225 MB: 210 194 178 162 146 130 114 98 82 66 50 34 18 2

(kgdb) backtrace
#0 doadump () at pcpu.h:196
#1 0xc07910a7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc0791379 in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0xc0aa37f6 in pmap_remove_pages (pmap=0xc69ae6e4) at /usr/src/sys/i386/i386/pmap.c:3084
#4 0xc09cf79c in vmspace_exit (td=0xc64f68c0) at /usr/src/sys/vm/vm_map.c:404
#5 0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305
#6 0xc076ca0d in sys_exit (td=Could not find the frame base for "sys_exit".
) at /usr/src/sys/kern/kern_exit.c:109
#7 0xc0aa81a5 in syscall (frame=0xe8d6ed38) at /usr/src/sys/i386/i386/trap.c:1090
#8 0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#9 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

(kgdb) allpcpu
cpuid = 3
curthread = 0xc6ae3d20: pid 48975 "sh"
curpcb = 0xe8ea1d90
fpcurthread = none
idlethread = 0xc633daf0: pid 11 "idle: cpu3"
switchticks = 37193321

cpuid = 2
curthread = 0xc633d8c0: pid 12 "idle: cpu2"
curpcb = 0xe4f10d90
fpcurthread = none
idlethread = 0xc633d8c0: pid 12 "idle: cpu2"
switchticks = 37193374

cpuid = 1
curthread = 0xc633d690: pid 13 "idle: cpu1"
curpcb = 0xe4f13d90
fpcurthread = none
idlethread = 0xc633d690: pid 13 "idle: cpu1"
switchticks = 37193374

cpuid = 0
curthread = 0xc64f68c0: pid 48980 "sh"
curpcb = 0xe8d6ed90
fpcurthread = none
idlethread = 0xc633d460: pid 14 "idle: cpu0"
switchticks = 37193321

(kgdb) ps
  pid  ppid  pgrp  uid state wmesg  wchan      cmd
48980 48975 48975    0 RE    CPU 0             sh
48978 48976 48976    0 R                       megarc
48976 48973 48976    0 Ss    wait   0xc826e570 sh
48975 48972 48975    0 Rs    CPU 3             sh
48973   705   705    0 S     piperd 0xc8303318 cron
48972   705   705    0 S     piperd 0xc674a18c cron
48267 18141 18141   80 S     lockf  0xc83922c0 httpd
48266 18141 18141   80 S     lockf  0xc7d62400 httpd
48265 18141 18141   80 S     select 0xc0c4ecb8 httpd
48264 18141 18141   80 S     lockf  0xc7ceb240 httpd
...

At the moment of the crash megarc was being run by cron (48973), and another
cron job was started at the same time (we have the following script set up to
run at the same time:

if [ -x /usr/local/bin/vnstat ] && [ `ls -l /var/db/vnstat/ | wc -l` -ge 1 ]; then /usr/local/bin/vnstat -u; fi)

and this sh process caused the panic on its exit, when the kernel was trying to
remove its address space, due to corrupted memory.

Should I add a comment about this to ports/137938? I have cc'd Gerrit. Please
note that we are using 7.1-RELEASE-p5, while ports/137938 says it was
checked on 7.2. But it might be that Gerrit just did not test long enough?
We had megarc enabled on several 7.1 hosts for some time and saw only this one
panic (well, there was another one about a week ago, but it hardly looked
related: megarc was not running at the moment of the crash and the
panic happened when removing an entry from the namecache; I reported it to
hackers@).

Below are some details from the gdb session in case someone is interested in
looking at this more closely.

(kgdb) allchains
# no output

(kgdb) fr 5
#5 0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305
305 vmspace_exit(td);
(kgdb) p *td->td_proc
$1 = {p_list = {le_next = 0xc69a2570, le_prev = 0xc0c433f8}, p_threads = {tqh_first = 0xc64f68c0,
tqh_last = 0xc64f68c8}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc6502838}, p_slock = {
lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock",
lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}},
mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc708f700, p_fd = 0x0, p_fdtol = 0x0,
p_stats = 0xc64f8000, p_limit = 0xc7c60800, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc65028b8,
c_flags = 0}, p_sigacts = 0xc7d00000, p_flag = 268443648, p_state = PRS_NORMAL, p_pid = 48980,
p_hash = {le_next = 0x0, le_prev = 0xc632ad50}, p_pglist = {le_next = 0x0, le_prev = 0xc709b8a0},
p_pptr = 0xc709b828, p_sibling = {le_next = 0x0, le_prev = 0xc709b8b4}, p_children = {
lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock",
lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = {
stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6655cd0,
p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
tqh_first = 0x0, tqh_last = 0xc65028f4}, sq_proc = 0xc6502828, sq_flags = 1}, p_oppid = 0,
p_vmspace = 0xc69ae658, p_swtick = 37193315, p_realtimer =
{it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = { tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = { rux_runtime = 20485868, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 6784, rux_tu = 6784}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xc66dce04, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, p_upsleeps = 0, p_completed = 0x0, p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, p_osrel = 701000, p_comm = "sh\000n\000er", '\0' <repeats 12 times>, p_pgrp = 0xc839c5c0, p_sysent = 0xc0c0a6e0, p_args = 0xc7c25b00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc0766af0 <knlist_mtx_lock>, kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, kl_locked = 0xc07664b0 <knlist_mtx_locked>, kl_lockarg = 0xc65028b8}, p_numthreads = 1, p_md = { md_ldt = 0x0}, p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0x0, c_flags = 16}, p_acflag = 1, p_peers = 0x0, p_leader = 0xc6502828, p_emuldata = 0x0, p_label = 0x0, p_sched = 0xc6502ae0, p_ktr = {stqh_first = 0x0, stqh_last = 0xc6502ad0}, p_mqnotifier = { lh_first = 0x0}, p_dtrace = 0x0} (kgdb) p *td $8 = {td_lock = 0xc0c4bcc0, td_proc 
= 0xc6502828, td_plist = {tqe_next = 0x0, tqe_prev = 0xc6502830}, td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f040}, td_lockq = {tqe_next = 0x0, tqe_prev = 0xe8ee6a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc64f68e0}, td_sleepqueue = 0xc632f040, td_turnstile = 0xc68d6eb0, td_umtxq = 0xc64d8840, td_tid = 100094, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = { tqh_first = 0x0, tqh_last = 0xc64f6918}, sq_proc = 0xc6502828, sq_flags = 1}, td_flags = 65542, td_inhibitors = 0, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, td_lastcpu = 0 '\0', td_oncpu = 0 '\0', td_owepreempt = 0 '\0', td_locks = -2, td_tsqueue = 0 '\0', td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 3, td_mailbox = 0x0, td_ucred = 0xc708f700, td_standin = 0x0, td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1768, ru_ixrss = 1512, ru_idrss = 8792, ru_isrss = 1792, ru_minflt = 51, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 2, ru_nivcsw = 1}, td_runtime = 3186278, td_pticks = 13, td_sticks = 14, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 3, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 134 '\206', td_priority = 134 '\206', td_pri_class = 3 '\003', td_user_pri = 144 '\220', td_base_user_pri = 144 '\220', td_pcb = 0xe8d6ed90, td_state = TDS_RUNNING, td_retval = {0, 134598480}, td_slpcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0xda3550f0}}, c_time = 34564372, c_arg = 0xc64f68c0, c_func = 
0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0, c_flags = 18}, td_frame = 0xe8d6ed38, td_kstack_obj = 0xc677e554, td_kstack = 3906392064, td_kstack_pages = 2, td_altkstack_obj = 0x0, td_altkstack = 0, td_altkstack_pages = 0, td_critnest = 0, td_md = {md_spinlock_count = 0, md_saved_flags = 70}, td_sched = 0xc64f6abc, td_ar = 0x0, td_syscalls = 75641, td_incruntime = 3186278, td_cpuset = 0xc6331e38, td_fpop = 0x0, td_dtrace = 0x0, td_errno = 0} (kgdb) thr 126 [Switching to thread 126 (Thread 100083)]#0 sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available. ) at /usr/src/sys/kern/sched_ule.c:1944 1944 cpuid = PCPU_GET(cpuid); (kgdb) backtrace #0 sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available. ) at /usr/src/sys/kern/sched_ule.c:1944 #1 0xc0799136 in mi_switch (flags=Variable "flags" is not available. ) at /usr/src/sys/kern/kern_synch.c:440 #2 0xc07c284b in sleepq_switch (wchan=Variable "wchan" is not available. ) at /usr/src/sys/kern/subr_sleepqueue.c:497 #3 0xc07c2e96 in sleepq_wait (wchan=0xc6492f28) at /usr/src/sys/kern/subr_sleepqueue.c:580 #4 0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd", timo=0) at /usr/src/sys/kern/kern_synch.c:226 #5 0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392 #6 0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1, td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914 #7 0xc0755d37 in giant_ioctl (dev=0xc645e700, cmd=3224388353, data=0xc7bb1c40 "\003", fflag=1, td=0xc674cd20) at /usr/src/sys/kern/kern_conf.c:408 #8 0xc071ff47 in devfs_ioctl_f (fp=0xc707063c, com=3224388353, data=0xc7bb1c40, cred=0xc707a200, td=0xc674cd20) at /usr/src/sys/fs/devfs/devfs_vnops.c:595 #9 0xc07c8005 in kern_ioctl (td=0xc674cd20, fd=4, com=3224388353, data=0xc7bb1c40 "\003") at file.h:268 #10 0xc07c8164 in ioctl (td=0xc674cd20, uap=0xe8d33cfc) at /usr/src/sys/kern/sys_generic.c:570 #11 0xc0aa81a5 in syscall 
(frame=0xe8d33d38) at /usr/src/sys/i386/i386/trap.c:1090 #12 0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255 #13 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) (kgdb) fr 4 #4 0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd", timo=0) at /usr/src/sys/kern/kern_synch.c:226 226 sleepq_wait(ident); (kgdb) p *lock $2 = {lo_name = 0xc0b04655 "AMR List Lock", lo_type = 0xc0b04655 "AMR List Lock", lo_flags = 16973824, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}} (kgdb) fr 5 #5 0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392 1392 error = msleep(ac,&sc->amr_list_lock, PRIBIO, "amrwcmd", 0); (kgdb) p ac $3 = (struct amr_command *) 0xc6492f28 (kgdb) p *ac $4 = {ac_link = {stqe_next = 0x0}, ac_sc = 0xc6475000, ac_slot = 118 'v', ac_status = 0, ac_sg = { sg32 = 0xe6993fe0, sg64 = 0xe6993fe0}, ac_sgbusaddr = 21716960, ac_sg64_lo = 0, ac_sg64_hi = 0, ac_mailbox = {mb_command = 3 '\003', mb_ident = 119 'w', mb_blkcount = 0, mb_lba = 0, mb_physaddr = 21773056, mb_drive = 0 '\0', mb_nsgelem = 0 '\0', res1 = 0 '\0', mb_busy = 0 '\0', mb_nstatus = 0 '\0', mb_status = 0 '\0', mb_completed = '\0' <repeats 45 times>, mb_poll = 0 '\0', mb_ack = 0 '\0', res2 = '\0' <repeats 15 times>}, ac_flags = 71, ac_retries = 0, ac_bio = 0x0, ac_complete = 0, ac_private = 0x0, ac_data = 0xc672ac90, ac_length = 8, ac_dmamap = 0x0, ac_dma64map = 0x0, ac_tag = 0xc647c600, ac_datamap = 0x0, ac_nsegments = 0, ac_mb_physaddr = 27487376, ac_ccb = 0xe699eb00, ac_ccb_busaddr = 21773056} (kgdb) fr 6 #6 0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1, td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914 914 error = amr_wait_command(ac); (kgdb) p *td $6 = {td_lock = 0xc0c4bcc0, td_proc = 0xc69a2570, td_plist = {tqe_next = 0x0, tqe_prev = 0xc69a2578}, td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f3e0}, td_lockq = {tqe_next = 
0x0, tqe_prev = 0xe8fe9a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc674cd40}, td_sleepqueue = 0xc632f3e0, td_turnstile = 0xc7c95460, td_umtxq = 0xc66a2500, td_tid = 100083, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = { tqh_first = 0x0, tqh_last = 0xc674cd78}, sq_proc = 0xc69a2570, sq_flags = 1}, td_flags = 4, td_inhibitors = 0, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, td_lastcpu = 1 '\001', td_oncpu = 255 '?', td_owepreempt = 0 '\0', td_locks = 6, td_tsqueue = 0 '\0', td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, td_intr_nesting_level = 0, td_pinned = 0, td_mailbox = 0x0, td_ucred = 0xc707a200, td_standin = 0x0, td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 39, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 9, ru_nivcsw = 1}, td_runtime = 6243427, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 10, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 76 'L', td_priority = 76 'L', td_pri_class = 3 '\003', td_user_pri = 128 '\200', td_base_user_pri = 128 '\200', td_pcb = 0xe8d33d90, td_state = TDS_CAN_RUN, td_retval = {0, 17}, td_slpcallout = {c_links = {sle = { sle_next = 0xc0c3f8d0}, tqe = {tqe_next = 0xc0c3f8d0, tqe_prev = 0xda33fce8}}, c_time = 34422419, c_arg = 0xc674cd20, c_func = 0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0, c_flags = 18}, td_frame = 0xe8d33d38, td_kstack_obj = 0xc68e9c1c, td_kstack = 3906150400, td_kstack_pages = 2, 
td_altkstack_obj = 0x0, td_altkstack = 0, td_altkstack_pages = 0, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xc674cf1c, td_ar = 0x0, td_syscalls = 99004, td_incruntime = 6243427, td_cpuset = 0xc6331e38, td_fpop = 0xc707063c, td_dtrace = 0x0, td_errno = 0} (kgdb) p *td->td_proc $7 = {p_list = {le_next = 0xc826e570, le_prev = 0xc6502828}, p_threads = {tqh_first = 0xc674cd20, tqh_last = 0xc674cd28}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc69a2580}, p_slock = { lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock", lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc707a200, p_fd = 0xc7c52d00, p_fdtol = 0x0, p_stats = 0xc674fd00, p_limit = 0xc7c34000, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = { tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc69a2600, c_flags = 0}, p_sigacts = 0xc7c76000, p_flag = 268451840, p_state = PRS_NORMAL, p_pid = 48978, p_hash = {le_next = 0x0, le_prev = 0xc632ad48}, p_pglist = {le_next = 0x0, le_prev = 0xc826e5e8}, p_pptr = 0xc826e570, p_sibling = {le_next = 0x0, le_prev = 0xc826e5fc}, p_children = { lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock", lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = { stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6656a00, p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = { tqh_first = 0x0, tqh_last = 0xc69a263c}, sq_proc = 0xc69a2570, sq_flags = 1}, p_oppid = 0, p_vmspace = 0xc6fb7488, p_swtick = 37193314, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = { tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt 
= 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xc6f368a0, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, p_upsleeps = 0, p_completed = 0x0, p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, p_osrel = 502010, p_comm = "megarc", '\0' <repeats 13 times>, p_pgrp = 0xc839d140, p_sysent = 0xc0c0a6e0, p_args = 0xc7bd6a00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc0766af0 <knlist_mtx_lock>, kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, kl_locked = 0xc07664b0 <knlist_mtx_locked>, kl_lockarg = 0xc69a2600}, p_numthreads = 1, p_md = {md_ldt = 0x0}, p_itcallout = {c_links = {sle = { sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0x0, c_flags = 16}, p_acflag = 0, p_peers = 0x0, p_leader = 0xc69a2570, p_emuldata = 0x0, p_label = 0x0, p_sched = 0xc69a2828, p_ktr = {stqh_first = 0x0, stqh_last = 0xc69a2818}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0}

I am keeping the vmcore in case any additional output is needed.

--
Mikolaj Golub
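As a reading aid for the megarc thread's backtrace: kgdb prints the ioctl command passed to amr_ioctl in decimal (cmd=3224388353). The following is only a minimal sketch, assuming the usual BSD command-word layout from <sys/ioccom.h> (direction flags in the top three bits, a 13-bit parameter length, then a group byte and a number byte). The helper name decode_ioctl is made up for illustration, and the sketch does not try to say which driver macro the value corresponds to.

```python
# Bit layout assumed from BSD <sys/ioccom.h>; the constants are copied here
# so the sketch runs on any machine, not only on FreeBSD.
IOCPARM_MASK = 0x1FFF   # 13 bits of parameter length
IOC_VOID = 0x20000000   # no parameters
IOC_OUT = 0x40000000    # parameters copied out to userland
IOC_IN = 0x80000000     # parameters copied in from userland


def decode_ioctl(cmd):
    """Split a BSD-style ioctl command word into its fields (hypothetical helper)."""
    flags = (("IN", IOC_IN), ("OUT", IOC_OUT), ("VOID", IOC_VOID))
    direction = [name for name, bit in flags if cmd & bit]
    return {
        "hex": hex(cmd),
        "direction": "|".join(direction),
        "length": (cmd >> 16) & IOCPARM_MASK,  # size of the argument structure
        "group": chr((cmd >> 8) & 0xFF),       # group letter
        "number": cmd & 0xFF,                  # command number within the group
    }


if __name__ == "__main__":
    # cmd value taken from frame #6 (amr_ioctl) of the megarc backtrace
    for key, value in decode_ioctl(3224388353).items():
        print(f"{key:9} {value}")
```

Under that assumed layout the value decodes to 0xc0304301: an in/out request with a 48-byte argument, group 'C', number 1.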
David Samms
2009-Sep-25 15:12 UTC
May running megarc still cause memory corruption on 7.X?
Mikolaj Golub wrote:
> Hi,
>
> Previously sysutils/megarc port was marked as broken with the statement:
> running megarc may cause memory corruption/system instability.
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/128082
>
> But recently it has been re-enabled:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/137938
>
> Gerrit Beine (the maintainer) said that he verified on 7.2 and it worked.
>
> But yesterday we had the panic on 7.1-RELEASE-p5 that looked like was caused
> by megarc with bt identical to reported in ports/128082.
>
> Unread portion of the kernel message buffer:
> TPTE at 0xbfd20830 IS ZERO @ VA 4820c000
> panic: bad pte
> cpuid = 0
> Uptime: 10h19m56s
> Physical memory: 3059 MB
> Dumping 225 MB: 210 194 178 162 146 130 114 98 82 66 50 34 18 2
>
> (kgdb) backtrace
> #0 doadump () at pcpu.h:196
> #1 0xc07910a7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
> #2 0xc0791379 in panic (fmt=Variable "fmt" is not available.
> ) at /usr/src/sys/kern/kern_shutdown.c:574
> #3 0xc0aa37f6 in pmap_remove_pages (pmap=0xc69ae6e4) at /usr/src/sys/i386/i386/pmap.c:3084
> #4 0xc09cf79c in vmspace_exit (td=0xc64f68c0) at /usr/src/sys/vm/vm_map.c:404
> #5 0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305
> #6 0xc076ca0d in sys_exit (td=Could not find the frame base for "sys_exit".
> ) at /usr/src/sys/kern/kern_exit.c:109
> #7 0xc0aa81a5 in syscall (frame=0xe8d6ed38) at /usr/src/sys/i386/i386/trap.c:1090
> #8 0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
> #9 0x00000033 in ?? ()
> Previous frame inner to this frame (corrupt stack?)
> > (kgdb) allpcpu > cpuid = 3 > curthread = 0xc6ae3d20: pid 48975 "sh" > curpcb = 0xe8ea1d90 > fpcurthread = none > idlethread = 0xc633daf0: pid 11 "idle: cpu3" > switchticks = 37193321 > > cpuid = 2 > curthread = 0xc633d8c0: pid 12 "idle: cpu2" > curpcb = 0xe4f10d90 > fpcurthread = none > idlethread = 0xc633d8c0: pid 12 "idle: cpu2" > switchticks = 37193374 > > cpuid = 1 > curthread = 0xc633d690: pid 13 "idle: cpu1" > curpcb = 0xe4f13d90 > fpcurthread = none > idlethread = 0xc633d690: pid 13 "idle: cpu1" > switchticks = 37193374 > > cpuid = 0 > curthread = 0xc64f68c0: pid 48980 "sh" > curpcb = 0xe8d6ed90 > fpcurthread = none > idlethread = 0xc633d460: pid 14 "idle: cpu0" > switchticks = 37193321 > > (kgdb) ps > pid ppid pgrp uid state wmesg wchan cmd > 48980 48975 48975 0 RE CPU 0 sh > 48978 48976 48976 0 R megarc > 48976 48973 48976 0 Ss wait 0xc826e570 sh > 48975 48972 48975 0 Rs CPU 3 sh > 48973 705 705 0 S piperd 0xc8303318 cron > 48972 705 705 0 S piperd 0xc674a18c cron > 48267 18141 18141 80 S lockf 0xc83922c0 httpd > 48266 18141 18141 80 S lockf 0xc7d62400 httpd > 48265 18141 18141 80 S select 0xc0c4ecb8 httpd > 48264 18141 18141 80 S lockf 0xc7ceb240 httpd > ... > > At the moment of the crash megarc was run by cron (48973) at the same time > other cron job was started (we have the following script set up to run in the > same time: > > if [ -x /usr/local/bin/vnstat ] && [ `ls -l /var/db/vnstat/ | wc -l` -ge 1 ]; then /usr/local/bin/vnstat -u; fi) > > and this sh process caused panic on its exit when kernel was trying to remove > its address space due to corrupted memory. > > Should I add the comment to ports/137938 about this? I cc to Gerrit. Please > note, we are using 7.1-RELEASE-p5 while in ports/137938 it is said that it was > checked on 7.2. But it might be that Gerrit just did not test long enough? 
We > had megarc enabled on several 7.1 hosts for some times and saw only this one > panic (well, there was another one about a week ago, but it looked hardly > related, because megarc was not running at the moment of the crash and the > panic was when removing an entry from the namecache, I reported it to > hackers@). > > Below some details from gdb session in case someone is interested to look at > this closer. > > (kgdb) allchains > # no output > > (kgdb) fr 5 > #5 0xc076b6ad in exit1 (td=0xc64f68c0, rv=0) at /usr/src/sys/kern/kern_exit.c:305 > 305 vmspace_exit(td); > (kgdb) p *td->td_proc > $1 = {p_list = {le_next = 0xc69a2570, le_prev = 0xc0c433f8}, p_threads = {tqh_first = 0xc64f68c0, > tqh_last = 0xc64f68c8}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc6502838}, p_slock = { > lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock", > lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}, > mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc708f700, p_fd = 0x0, p_fdtol = 0x0, > p_stats = 0xc64f8000, p_limit = 0xc7c60800, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = { > tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc65028b8, > c_flags = 0}, p_sigacts = 0xc7d00000, p_flag = 268443648, p_state = PRS_NORMAL, p_pid = 48980, > p_hash = {le_next = 0x0, le_prev = 0xc632ad50}, p_pglist = {le_next = 0x0, le_prev = 0xc709b8a0}, > p_pptr = 0xc709b828, p_sibling = {le_next = 0x0, le_prev = 0xc709b8b4}, p_children = { > lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock", > lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = { > stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6655cd0, > p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = { > tqh_first = 0x0, tqh_last = 0xc65028f4}, sq_proc = 0xc6502828, sq_flags = 1}, p_oppid = 0, > 
p_vmspace = 0xc69ae658, p_swtick = 37193315, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, > it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = { > tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, > ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, > ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0, > rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = { > rux_runtime = 20485868, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 6784, > rux_tu = 6784}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, > p_tracecred = 0x0, p_textvp = 0xc66dce04, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, > p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', > p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, > p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, > p_upsleeps = 0, p_completed = 0x0, p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, > p_osrel = 701000, p_comm = "sh\000n\000er", '\0' <repeats 12 times>, p_pgrp = 0xc839c5c0, > p_sysent = 0xc0c0a6e0, p_args = 0xc7c25b00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', > p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, > kl_lock = 0xc0766af0 <knlist_mtx_lock>, kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, > kl_locked = 0xc07664b0 <knlist_mtx_locked>, kl_lockarg = 0xc65028b8}, p_numthreads = 1, p_md = { > md_ldt = 0x0}, p_itcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, > tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0x0, c_flags = 16}, > p_acflag = 1, p_peers = 0x0, p_leader = 0xc6502828, p_emuldata = 0x0, p_label = 0x0, > p_sched = 0xc6502ae0, p_ktr = {stqh_first = 0x0, stqh_last = 
0xc6502ad0}, p_mqnotifier = { > lh_first = 0x0}, p_dtrace = 0x0} > (kgdb) p *td > $8 = {td_lock = 0xc0c4bcc0, td_proc = 0xc6502828, td_plist = {tqe_next = 0x0, tqe_prev = 0xc6502830}, > td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f040}, td_lockq = {tqe_next = 0x0, > tqe_prev = 0xe8ee6a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc64f68e0}, > td_sleepqueue = 0xc632f040, td_turnstile = 0xc68d6eb0, td_umtxq = 0xc64d8840, td_tid = 100094, > td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = { > tqh_first = 0x0, tqh_last = 0xc64f6918}, sq_proc = 0xc6502828, sq_flags = 1}, td_flags = 65542, > td_inhibitors = 0, td_pflags = 0, td_dupfd = 0, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0, > td_lastcpu = 0 '\0', td_oncpu = 0 '\0', td_owepreempt = 0 '\0', td_locks = -2, td_tsqueue = 0 '\0', > td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0, > td_intr_nesting_level = 0, td_pinned = 3, td_mailbox = 0x0, td_ucred = 0xc708f700, td_standin = 0x0, > td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, > ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 1768, ru_ixrss = 1512, ru_idrss = 8792, > ru_isrss = 1792, ru_minflt = 51, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, > ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 2, ru_nivcsw = 1}, td_runtime = 3186278, > td_pticks = 13, td_sticks = 14, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, > td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, > td_generation = 3, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, > td_profil_addr = 0, td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 134 '\206', > td_priority = 134 '\206', td_pri_class = 3 '\003', td_user_pri = 144 '\220', > td_base_user_pri = 144 '\220', td_pcb = 0xe8d6ed90, td_state = TDS_RUNNING, td_retval = {0, > 
134598480}, td_slpcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, > tqe_prev = 0xda3550f0}}, c_time = 34564372, c_arg = 0xc64f68c0, > c_func = 0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0, c_flags = 18}, td_frame = 0xe8d6ed38, > td_kstack_obj = 0xc677e554, td_kstack = 3906392064, td_kstack_pages = 2, td_altkstack_obj = 0x0, > td_altkstack = 0, td_altkstack_pages = 0, td_critnest = 0, td_md = {md_spinlock_count = 0, > md_saved_flags = 70}, td_sched = 0xc64f6abc, td_ar = 0x0, td_syscalls = 75641, > td_incruntime = 3186278, td_cpuset = 0xc6331e38, td_fpop = 0x0, td_dtrace = 0x0, td_errno = 0} > > (kgdb) thr 126 > [Switching to thread 126 (Thread 100083)]#0 sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available. > ) > at /usr/src/sys/kern/sched_ule.c:1944 > 1944 cpuid = PCPU_GET(cpuid); > (kgdb) backtrace > #0 sched_switch (td=0xc674cd20, newtd=Variable "newtd" is not available. > ) at /usr/src/sys/kern/sched_ule.c:1944 > #1 0xc0799136 in mi_switch (flags=Variable "flags" is not available. > ) at /usr/src/sys/kern/kern_synch.c:440 > #2 0xc07c284b in sleepq_switch (wchan=Variable "wchan" is not available. 
> ) at /usr/src/sys/kern/subr_sleepqueue.c:497 > #3 0xc07c2e96 in sleepq_wait (wchan=0xc6492f28) at /usr/src/sys/kern/subr_sleepqueue.c:580 > #4 0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd", > timo=0) at /usr/src/sys/kern/kern_synch.c:226 > #5 0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392 > #6 0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1, > td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914 > #7 0xc0755d37 in giant_ioctl (dev=0xc645e700, cmd=3224388353, data=0xc7bb1c40 "\003", fflag=1, > td=0xc674cd20) at /usr/src/sys/kern/kern_conf.c:408 > #8 0xc071ff47 in devfs_ioctl_f (fp=0xc707063c, com=3224388353, data=0xc7bb1c40, cred=0xc707a200, > td=0xc674cd20) at /usr/src/sys/fs/devfs/devfs_vnops.c:595 > #9 0xc07c8005 in kern_ioctl (td=0xc674cd20, fd=4, com=3224388353, data=0xc7bb1c40 "\003") at file.h:268 > #10 0xc07c8164 in ioctl (td=0xc674cd20, uap=0xe8d33cfc) at /usr/src/sys/kern/sys_generic.c:570 > #11 0xc0aa81a5 in syscall (frame=0xe8d33d38) at /usr/src/sys/i386/i386/trap.c:1090 > #12 0xc0a8e6e0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255 > #13 0x00000033 in ?? () > Previous frame inner to this frame (corrupt stack?) 
>
> (kgdb) fr 4
> #4 0xc07995a6 in _sleep (ident=0xc6492f28, lock=0xc647592c, priority=76, wmesg=0xc0b042bb "amrwcmd",
> timo=0) at /usr/src/sys/kern/kern_synch.c:226
> 226 sleepq_wait(ident);
> (kgdb) p *lock
> $2 = {lo_name = 0xc0b04655 "AMR List Lock", lo_type = 0xc0b04655 "AMR List Lock", lo_flags = 16973824,
> lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}}
>
> (kgdb) fr 5
> #5 0xc04e8ca4 in amr_wait_command (ac=0xc6492f28) at /usr/src/sys/dev/amr/amr.c:1392
> 1392 error = msleep(ac,&sc->amr_list_lock, PRIBIO, "amrwcmd", 0);
> (kgdb) p ac
> $3 = (struct amr_command *) 0xc6492f28
> (kgdb) p *ac
> $4 = {ac_link = {stqe_next = 0x0}, ac_sc = 0xc6475000, ac_slot = 118 'v', ac_status = 0, ac_sg = {
> sg32 = 0xe6993fe0, sg64 = 0xe6993fe0}, ac_sgbusaddr = 21716960, ac_sg64_lo = 0, ac_sg64_hi = 0,
> ac_mailbox = {mb_command = 3 '\003', mb_ident = 119 'w', mb_blkcount = 0, mb_lba = 0,
> mb_physaddr = 21773056, mb_drive = 0 '\0', mb_nsgelem = 0 '\0', res1 = 0 '\0', mb_busy = 0 '\0',
> mb_nstatus = 0 '\0', mb_status = 0 '\0', mb_completed = '\0' <repeats 45 times>, mb_poll = 0 '\0',
> mb_ack = 0 '\0', res2 = '\0' <repeats 15 times>}, ac_flags = 71, ac_retries = 0, ac_bio = 0x0,
> ac_complete = 0, ac_private = 0x0, ac_data = 0xc672ac90, ac_length = 8, ac_dmamap = 0x0,
> ac_dma64map = 0x0, ac_tag = 0xc647c600, ac_datamap = 0x0, ac_nsegments = 0,
> ac_mb_physaddr = 27487376, ac_ccb = 0xe699eb00, ac_ccb_busaddr = 21773056}
>
> (kgdb) fr 6
> #6 0xc04e9faa in amr_ioctl (dev=0xc645e700, cmd=3224388353, addr=0xc7bb1c40 "\003", flag=1,
> td=0xc674cd20) at /usr/src/sys/dev/amr/amr.c:914
> 914 error = amr_wait_command(ac);
> (kgdb) p *td
> $6 = {td_lock = 0xc0c4bcc0, td_proc = 0xc69a2570, td_plist = {tqe_next = 0x0, tqe_prev = 0xc69a2578},
> td_slpq = {tqe_next = 0x0, tqe_prev = 0xc632f3e0}, td_lockq = {tqe_next = 0x0,
> tqe_prev = 0xe8fe9a6c}, td_selq = {tqh_first = 0x0, tqh_last = 0xc674cd40},
> td_sleepqueue = 0xc632f3e0, td_turnstile = 0xc7c95460, td_umtxq = 0xc66a2500, td_tid = 100083,
> td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
> tqh_first = 0x0, tqh_last = 0xc674cd78}, sq_proc = 0xc69a2570, sq_flags = 1}, td_flags = 4,
> td_inhibitors = 0, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0x0, td_wmesg = 0x0,
> td_lastcpu = 1 '\001', td_oncpu = 255 '?', td_owepreempt = 0 '\0', td_locks = 6, td_tsqueue = 0 '\0',
> td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0x0,
> td_intr_nesting_level = 0, td_pinned = 0, td_mailbox = 0x0, td_ucred = 0xc707a200, td_standin = 0x0,
> td_upcall = 0x0, td_estcpu = 0, td_slptick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0},
> ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0,
> ru_minflt = 39, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0,
> ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 9, ru_nivcsw = 1}, td_runtime = 6243427, td_pticks = 0,
> td_sticks = 0, td_iticks = 0, td_uticks = 0, td_uuticks = 0, td_usticks = 0, td_intrval = 0,
> td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 10,
> td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_kflags = 0, td_xsig = 0, td_profil_addr = 0,
> td_profil_ticks = 0, td_name = '\0' <repeats 19 times>, td_base_pri = 76 'L', td_priority = 76 'L',
> td_pri_class = 3 '\003', td_user_pri = 128 '\200', td_base_user_pri = 128 '\200',
> td_pcb = 0xe8d33d90, td_state = TDS_CAN_RUN, td_retval = {0, 17}, td_slpcallout = {c_links = {sle = {
> sle_next = 0xc0c3f8d0}, tqe = {tqe_next = 0xc0c3f8d0, tqe_prev = 0xda33fce8}},
> c_time = 34422419, c_arg = 0xc674cd20, c_func = 0xc07c2f90 <sleepq_timeout>, c_mtx = 0x0,
> c_flags = 18}, td_frame = 0xe8d33d38, td_kstack_obj = 0xc68e9c1c, td_kstack = 3906150400,
> td_kstack_pages = 2, td_altkstack_obj = 0x0, td_altkstack = 0, td_altkstack_pages = 0,
> td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xc674cf1c,
> td_ar = 0x0, td_syscalls = 99004, td_incruntime = 6243427, td_cpuset = 0xc6331e38,
> td_fpop = 0xc707063c, td_dtrace = 0x0, td_errno = 0}
> (kgdb) p *td->td_proc
> $7 = {p_list = {le_next = 0xc826e570, le_prev = 0xc6502828}, p_threads = {tqh_first = 0xc674cd20,
> tqh_last = 0xc674cd28}, p_upcalls = {tqh_first = 0x0, tqh_last = 0xc69a2580}, p_slock = {
> lock_object = {lo_name = 0xc0b3b5ae "process slock", lo_type = 0xc0b3b5ae "process slock",
> lo_flags = 720896, lo_witness_data = {lod_list = {stqe_next = 0x0}, lod_witness = 0x0}},
> mtx_lock = 4, mtx_recurse = 0}, p_ucred = 0xc707a200, p_fd = 0xc7c52d00, p_fdtol = 0x0,
> p_stats = 0xc674fd00, p_limit = 0xc7c34000, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = {
> tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_mtx = 0xc69a2600,
> c_flags = 0}, p_sigacts = 0xc7c76000, p_flag = 268451840, p_state = PRS_NORMAL, p_pid = 48978,
> p_hash = {le_next = 0x0, le_prev = 0xc632ad48}, p_pglist = {le_next = 0x0, le_prev = 0xc826e5e8},
> p_pptr = 0xc826e570, p_sibling = {le_next = 0x0, le_prev = 0xc826e5fc}, p_children = {
> lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc0b3b5a1 "process lock",
> lo_type = 0xc0b3b5a1 "process lock", lo_flags = 21168128, lo_witness_data = {lod_list = {
> stqe_next = 0x0}, lod_witness = 0x0}}, mtx_lock = 4, mtx_recurse = 0}, p_ksi = 0xc6656a00,
> p_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {
> tqh_first = 0x0, tqh_last = 0xc69a263c}, sq_proc = 0xc69a2570, sq_flags = 1}, p_oppid = 0,
> p_vmspace = 0xc6fb7488, p_swtick = 37193314, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0},
> it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {
> tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0,
> ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0,
> ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = {rux_runtime = 0, rux_uticks = 0,
> rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_crux = {rux_runtime = 0,
> rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0},
> p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0,
> p_textvp = 0xc6f368a0, p_lock = 0 '\0', p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0,
> p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0,
> p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0,
> p_pendingcnt = 0, p_itimers = 0x0, p_numupcalls = 0, p_upsleeps = 0, p_completed = 0x0,
> p_nextupcall = 0, p_upquantum = 0, p_magic = 3203398350, p_osrel = 502010,
> p_comm = "megarc", '\0' <repeats 13 times>, p_pgrp = 0xc839d140, p_sysent = 0xc0c0a6e0,
> p_args = 0xc7bd6a00, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0,
> p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc0766af0 <knlist_mtx_lock>,
> kl_unlock = 0xc07664d0 <knlist_mtx_unlock>, kl_locked = 0xc07664b0 <knlist_mtx_locked>,
> kl_lockarg = 0xc69a2600}, p_numthreads = 1, p_md = {md_ldt = 0x0}, p_itcallout = {c_links = {sle = {
> sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0,
> c_mtx = 0x0, c_flags = 16}, p_acflag = 0, p_peers = 0x0, p_leader = 0xc69a2570, p_emuldata = 0x0,
> p_label = 0x0, p_sched = 0xc69a2828, p_ktr = {stqh_first = 0x0, stqh_last = 0xc69a2818},
> p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0}
>
> I am keeping vmcore in case any additional output is needed.
>

I run 7.2 and after seeing the note that megarc port should work on 7.2 I re-synced the source and installed it on a production server. I ran megarc just once, and the server locked up within 8 hours while under very light load.
I was not able to confirm that the crash was related to megarc, but since it was the first server crash since 7.2 came out, I strongly suspect megarc.