We believe ptrace has a problem in 6.3; we have not tried other releases. The same code, however, exists in 7.1. The bug was first encountered in gdb... (gdb) det Detaching from program: /usr/local/bin/emacs, process 66217 (gdb) att 66224 Attaching to program: /usr/local/bin/emacs, process 66224 Error accessing memory address 0x281ba5a4: Device busy. (gdb) det Detaching from program: /usr/local/bin/emacs, process 66224 ptrace: Device busy. (gdb) quit <--- target process 66224 dies here To isolate this problem, a wrote a simple minded test program was written that just attached and detached. This test program found even the very first detach fails with EBUSY (see test source below): $ ./test1 -p 66217 -c 1 -d 10 pid 66217 count 1 delay 10 Start of pass 0 Calling PT_ATTACH pid 66217 addr 0x0 sig 0 Calling PT_DETACH pid 66217 addr 0xffffffff sig 0 Call 0 to PT_DETACH returned -1, errno 16 Once again, the target process died when the ptracing test program exitted, as would be expected if a detach had failed. The failure return was coming from the following test in kern_ptrace() in sys_process.c /* not currently stopped */ if ((p->p_flag & (P_STOPPED_SIG | P_STOPPED_TRACE)) == 0 || p->p_suspcount != p->p_numthreads || (p->p_flag & P_WAITED) == 0) { error = EBUSY; goto fail; } This is applied to all operations except PT_TRACE_ME, PT_ATTACH, and some instances of PT_CLEAR_STEP. P_WAITED is generally not true. In particular, it's not set automatically when a process is PT_ATTACHed. It is cleared by PT_DETACH and again when ptrace sends a signal (PT_CONTINUE, PT_DETACH.) _But_ it's set in only two places, and they aren't in ptrace code. 2 sys/kern/kern_exit.c kern_wait 773 p->p_flag |= P_WAITED; 3 compat/svr4/svr4_misc.c svr4_sys_waitsys 1351 q->p_flag |= P_WAITED; The relevant one is the first one, primarily. Here's the code: mtx_lock_spin(&sched_lock); if ((p->p_flag & P_STOPPED_SIG) && (p->p_suspcount == p->p_numthreads) && (p->p_flag & P_WAITED) == 0 && (p->p_flag & P_TRACED || options & WUNTRACED)) { mtx_unlock_spin(&sched_lock); p->p_flag |= P_WAITED; sx_xunlock(&proctree_lock); td->td_retval[0] = p->p_pid; if (status) *status = W_STOPCODE(p->p_xstat); PROC_UNLOCK(p); return (0); } mtx_unlock_spin(&sched_lock); So it's only set on processes which are already traced. But it's not set until someone calls wait4() on them - or the equivalent sysV compatability routine. Gdb doesn't always wait4() for processes immediately opon tracing them, and the ptrace man page does not imply this is needed. Moreover, it's not clear why it should matter. The process needs to be stopped in order for it to make sense to do most of the things ptrace does. But - why should it need to be waited for? And what kind of sense does this make to someone writing a debugging tool, where the natural logic seems to be: - attach to process - look at some stuff - stick in some kind of breakpoint or similar and start it going again (or 'step' it) - wait for it to stop - look at and modify stuff - detach, or set it moving again By way of experiment, the test for P_WAITED was removed. Gdb no longer had problems, and no new issues with gdb were encountered (although this was just interactive, no "gdb coverage test" was attempted). The test program also stopped having issues. /* not currently stopped */ if ((p->p_flag & (P_STOPPED_SIG | P_STOPPED_TRACE)) == 0 || p->p_suspcount != p->p_numthreads { error = EBUSY; goto fail; } So does anyone know whether it's safe to simply remove that test? Thanks, Arlie Stephens Engineer Dorr H. Clark Advisor Graduate School of Engineering Santa Clara University, Santa Clara, CA --------------------------------------------------------------------- Test program here --------------------------------------------------------------------- /* * experiment with ptrace, try to see which is broken - gdb or ptrace */ #include <stdio.h> #include <stdlib.h> #include <unistd.h> #include <errno.h> #include <sys/types.h> #include <sys/ptrace.h> void usage(void) { printf("Simple program to play with ptrace\n"); printf("Usage: test1 -p <pid> -c <count> -d <delay (sec)>\n"); printf("Specify -n for no explicit detach\n"); printf("Will attach and detach repeatedly from target process\n"); exit(1); } int main(int argc, char *argv[]) { pid_t pid = -1; int count = 2; int delay = 5; int nodetach = 0; int opt; int ret; int i; while((opt = getopt(argc, argv, ":p:c:d:n")) != -1) { switch(opt) { case 'c': if (sscanf(optarg, "%d", &count) != 1) { printf("Count should be numeric\n"); usage(); } break; case 'd': if (sscanf(optarg, "%d", &delay) != 1) { printf("Delay should be numeric\n"); usage(); } break; case 'n': nodetach = 1; break; case 'p': if (sscanf(optarg, "%d", &pid) != 1) { printf("Pid should be numeric\n"); usage(); } break; default: printf("Illegal option -%c\n", opt); usage(); break; } } printf("pid %d count %d delay %d\n", pid, count, delay); if (pid == -1) { printf("Pid must be specified\n"); usage(); } if (count <= 0) { printf("Count must be positive\n"); usage(); } if (delay < 0) { printf("Delay must not be negative\n"); usage(); } for (i = 0; i < count; i++) { printf("Start of pass %d\n", i); errno = 0; printf("Calling PT_ATTACH pid %d addr 0x%lx sig %d\n", pid, (unsigned long)(caddr_t)NULL, 0); ret = ptrace(PT_ATTACH, pid, NULL, 0); if (ret != 0) { printf("Call %d to PT_ATTACH returned %d, errno %d\n", i, ret, errno); } sleep(delay); if (!nodetach) { errno = 0; printf("Calling PT_DETACH pid %d addr 0x%lx sig %d\n", pid, (unsigned long)(caddr_t)-1, 0); ret = ptrace(PT_DETACH, pid, (caddr_t)-1, 0); if (ret != 0) { printf("Call %d to PT_DETACH returned %d, " "errno %d\n", i, ret, errno); } } sleep(delay); } return 0; }
Dorr, good day. Tue, Oct 27, 2009 at 05:32:34PM -0700, Dorr H. Clark wrote:> We believe ptrace has a problem in 6.3; we have not tried other > releases. The same code, however, exists in 7.1.And in HEAD too.> The bug was first encountered in gdb... > > (gdb) det > Detaching from program: /usr/local/bin/emacs, process 66217 > (gdb) att 66224 > Attaching to program: /usr/local/bin/emacs, process 66224 > Error accessing memory address 0x281ba5a4: Device busy. > (gdb) det > Detaching from program: /usr/local/bin/emacs, process 66224 > ptrace: Device busy. > (gdb) quit <--- target process 66224 dies here > > To isolate this problem, a wrote a simple minded test program was > written that just attached and detached. This test program found > even the very first detach fails with EBUSY (see test source below): > > $ ./test1 -p 66217 -c 1 -d 10 > pid 66217 count 1 delay 10 > Start of pass 0 > Calling PT_ATTACH pid 66217 addr 0x0 sig 0 > Calling PT_DETACH pid 66217 addr 0xffffffff sig 0 > Call 0 to PT_DETACH returned -1, errno 16 > > Once again, the target process died when the ptracing test program > exitted, as would be expected if a detach had failed. > > The failure return was coming from the following test in kern_ptrace() > in sys_process.c > > /* not currently stopped */ > if ((p->p_flag & (P_STOPPED_SIG | P_STOPPED_TRACE)) == 0 || > p->p_suspcount != p->p_numthreads || > (p->p_flag & P_WAITED) == 0) { > error = EBUSY; > goto fail; > }Yes, the ptraced process should have been waited for, even after the PT_ATTACH call. This is somewhat documented in ptrace(2), ----- ----- but I agree that the wording is a bit sloppy. I'll try to produce slightly modified explanation in the manual page and will post the patch here and as the PR. I had modified your example to visually display the results of each wait() call that is made after ptrace() invocation. Here we go: ----- $ ./test -p 45901 pid 45901 count 2 delay 5 Start of pass 0 Calling PT_ATTACH pid 45901 addr 0x0 sig 0 Attached wait() yield 0x117f: stopped by signal 17; <-- after PT_ATTACH wait() yield 0x57f: stopped by signal 5; <-- after PT_STEP Calling PT_DETACH pid 45901 addr 0xffffffffffffffff sig 0 Detached. ----- As you see, the process is stopped just after the PT_ATTACH with the signal 17, SIGSTOP. PT_STEP follows with the delivery of the SIGTRAP. Both of these signals should be processed by the parent's wait(). And PT_DETACH works, apart from one thing: on my 8.0 PT_DETACH leads to the segfault of the traced program. I hadn't yet tried it on the other versions, so may be there is some bug in the code of test.c, or some bug in the ptrace() implementation -- can't say for sure. If anyone knows why the program segfaults -- please, speak up. The modified source of the test.s is attached.> This is applied to all operations except PT_TRACE_ME, PT_ATTACH, and > some instances of PT_CLEAR_STEP. > > P_WAITED is generally not true. In particular, it's not set > automatically when a process is PT_ATTACHed. It is cleared by > PT_DETACH and again when ptrace sends a signal (PT_CONTINUE, > PT_DETACH.) _But_ it's set in only two places, and they aren't in > ptrace code. > > 2 sys/kern/kern_exit.c kern_wait 773 p->p_flag |= P_WAITED; > 3 compat/svr4/svr4_misc.c svr4_sys_waitsys 1351 q->p_flag |= P_WAITED; > > The relevant one is the first one, primarily. Here's the code: > > mtx_lock_spin(&sched_lock); > if ((p->p_flag & P_STOPPED_SIG) && > (p->p_suspcount == p->p_numthreads) && > (p->p_flag & P_WAITED) == 0 && > (p->p_flag & P_TRACED || options & WUNTRACED)) { > mtx_unlock_spin(&sched_lock); > p->p_flag |= P_WAITED; > sx_xunlock(&proctree_lock); > td->td_retval[0] = p->p_pid; > if (status) > *status = W_STOPCODE(p->p_xstat); > PROC_UNLOCK(p); > return (0); > } > mtx_unlock_spin(&sched_lock); > > So it's only set on processes which are already traced. But it's not > set until someone calls wait4() on them - or the equivalent sysV > compatability routine. > > Gdb doesn't always wait4() for processes immediately opon tracing > them, and the ptrace man page does not imply this is needed.Hmm, there is at least one thread on the simular matter, http://sourceware.org/ml/gdb/2008-12/msg00041.html and people are saying that wait() still should be present.> Moreover, it's not clear why it should matter. The process > needs to be stopped in order for it to make sense to do most > of the things ptrace does. But - why should it need to be waited for?To see if it was really stopped, I presume.> And what kind of sense does this make to someone writing a debugging > tool, where the natural logic seems to be: > - attach to process- wait for the process' attachment by doing wait().> - look at some stuff > - stick in some kind of breakpoint or similar and start it going again > (or 'step' it) > - wait for it to stop > - look at and modify stuff > - detach, or set it moving again > > By way of experiment, the test for P_WAITED was removed. Gdb no longer had > problems, and no new issues with gdb were encountered (although this > was just interactive, no "gdb coverage test" was attempted).By the way, I can't reproduce gdb faults with the 8.0 sources. Will try 7.x, but I think that I have no 6.x handy. -- Eygene _ ___ _.--. # \`.|\..----...-'` `-._.-'_.-'` # Remember that it is hard / ' ` , __.--' # to read the on-line manual )/' _/ \ `-_, / # while single-stepping the kernel. `-'" `"\_ ,_.-;_.-\_ ', fsc/as # _.-'_./ {_.' ; / # -- FreeBSD Developers handbook {_.-``-' {_/ #
With FreeBSD 4.x, gdb -k is able to read and interpret the last 4 bytes of a page (4k) boundary. In BSD 6.x/7.x/8.x using the kgdb program, if one issues the kgdb command: (gdb) x /x 0xcbed8ffd An "invalid address" error is returned. However, if one issues the command: (gdb) x /10x 0xcbed8ff0 it is able to read the memory (and past) just fine. The following patch returns the usr/src/lib/libkvm/kvm_i386.c behavior closer to the BSD4.x version and seems to remedy this situation. @@ -289,11 +289,13 @@ #define PG_FRAME4M (~PAGE4M_MASK) pde_pa = ((u_long)pde & PG_FRAME4M) + (va & PAGE4M_MASK); s = _kvm_pa2off(kd, pde_pa, &ofs); +#if 0 if (s < sizeof pde) { _kvm_syserr(kd, kd->program, "_kvm_vatop: pde_pa not found"); goto invalid; } +#endif *pa = ofs; return (NBPDR - (va & PAGE4M_MASK)); } Does anyone see any problem or have any comments about this? Paul Lai Engineer Dorr H. Clark Advisor Graduate School of Engineering Santa Clara University Santa Clara, CA. http://www.cse.scu.edu/~dclark/coen_284_FreeBSD/libkvm_problem.txt
On Wed, 4 Nov 2009 15:06:17 -0800 (PST) "Dorr H. Clark" <dclark@engr.scu.edu> wrote:> > With FreeBSD 4.x, gdb -k is able to read and interpret > the last 4 bytes of a page (4k) boundary. > > In BSD 6.x/7.x/8.x using the kgdb program, > if one issues the kgdb command: > (gdb) x /x 0xcbed8ffd > An "invalid address" error is returned. >This is only 3 bytes - d, e, f. Try it with 0xcbed8ffc. BTW you got a little carried away with cross posting. --- Gary Jennejohn
On Wednesday 04 November 2009 6:06:17 pm Dorr H. Clark wrote:> > With FreeBSD 4.x, gdb -k is able to read and interpret > the last 4 bytes of a page (4k) boundary. > > In BSD 6.x/7.x/8.x using the kgdb program, > if one issues the kgdb command: > (gdb) x /x 0xcbed8ffd > An "invalid address" error is returned. > > However, if one issues the command: > (gdb) x /10x 0xcbed8ff0 > it is able to read the memory (and past) just fine. > > The following patch returns the usr/src/lib/libkvm/kvm_i386.c > behavior closer to the BSD4.x version and seems to remedy this situation. > > @@ -289,11 +289,13 @@ > #define PG_FRAME4M (~PAGE4M_MASK) > pde_pa = ((u_long)pde & PG_FRAME4M) + (va & PAGE4M_MASK); > s = _kvm_pa2off(kd, pde_pa, &ofs); > +#if 0 > if (s < sizeof pde) { > _kvm_syserr(kd, kd->program, > "_kvm_vatop: pde_pa not found"); > goto invalid; > } > +#endif > *pa = ofs; > return (NBPDR - (va & PAGE4M_MASK)); > } > > Does anyone see any problem or have any comments about this?How about this. It needs to fail if the page is not found at all, but this should fix your edge case. It also matches what kvm_amd64.c does. I think this was just a copy and paste bug. Index: kvm_i386.c ==================================================================--- kvm_i386.c (revision 198888) +++ kvm_i386.c (working copy) @@ -295,9 +295,9 @@ #define PG_FRAME4M (~PAGE4M_MASK) pde_pa = ((u_long)pde & PG_FRAME4M) + (va & PAGE4M_MASK); s = _kvm_pa2off(kd, pde_pa, &ofs); - if (s < sizeof pde) { - _kvm_syserr(kd, kd->program, - "_kvm_vatop: pde_pa not found"); + if (s == 0) { + _kvm_err(kd, kd->program, + "_kvm_vatop: 4MB page address not in dump"); goto invalid; } *pa = ofs; @@ -391,9 +391,9 @@ #define PG_FRAME2M (~PAGE2M_MASK) pde_pa = ((u_long)pde & PG_FRAME2M) + (va & PAGE2M_MASK); s = _kvm_pa2off(kd, pde_pa, &ofs); - if (s < sizeof pde) { - _kvm_syserr(kd, kd->program, - "_kvm_vatop_pae: pde_pa not found"); + if (s == 0) { + _kvm_err(kd, kd->program, + "_kvm_vatop: 2MB page address not in dump"); goto invalid; } *pa = ofs; -- John Baldwin
We believe we have identified a significant resource leak present in 6.x, 7.x, and 8.x. We believe this is a regression versus FreeBSD 4.x which appears to do the Right Thing (tm). We have a test program (see below) which will run the system out of sockets by repeated exercise of the failing code path in the kernel. Our proposed fix is applied to the file usr/src/sys/fs/fifofs/fifo_vnops.c @@ -237,6 +237,8 @@ if (ap->a_mode & FWRITE) { if ((ap->a_mode & O_NONBLOCK) && fip->fi_readers == 0) { mtx_unlock(&fifo_mtx); + /* Exclusive VOP lock is held - safe to clean */ + fifo_cleanup(vp); return (ENXIO); } fip->fi_writers++; Test program follows. We welcome feedback on this proposed fix. Chitti Nimmagadda Engineer Dorr H. Clark Advisor Graduate School of Engineering Santa Clara University Santa Clara, CA. A test program which reveals the issue is presented here: #include <stdio.h> #include <stdio.h> #include <sys/types.h> #include <sys/sysctl.h> #include <sys/stat.h> #include <fcntl.h> #include <errno.h> #include <stdlib.h> #include <string.h> #include <signal.h> #define FIFOPATH "/tmp/fifobug.debug" void getsysctl(name, ptr, len) const char *name; void *ptr; size_t len; { size_t nlen = len; if (sysctlbyname(name, ptr, &nlen, NULL, 0) != 0) { perror("sysctl"); printf("name: %s\n", name); exit(-1); } if (nlen != len) { printf("sysctl(%s...) expected %lu, got %lu", name, (unsigned long)len, (unsigned long)nlen); exit(-2); } } main(int argc, char *argv[]) { int acnt = 0, bcnt = 0, maxcnt; int fd; unsigned int maxiter; int notdone = 1; int i= 0; getsysctl("kern.ipc.maxsockets", &maxcnt, sizeof(maxcnt)); if (argc == 2) { maxiter = atoi(argv[1]); } else { maxiter = maxcnt*2; } unlink(FIFOPATH); printf("Max sockets: %d\n", maxcnt); printf("FIFO %s will be created, opened, deleted %d times\n", FIFOPATH, maxiter); getsysctl("kern.ipc.numopensockets", &bcnt, sizeof(bcnt)); while(notdone && (i++ < maxiter)) { if (mkfifo(FIFOPATH, 0) == 0) { chmod(FIFOPATH, S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH); } fd = open(FIFOPATH, O_WRONLY|O_NONBLOCK); if ((fd <= 0) && (errno != ENXIO)) { notdone = 0; } unlink(FIFOPATH); } getsysctl("kern.ipc.numopensockets", &acnt, sizeof(acnt)); printf("Open Sockets: Before Test: %d, After Test: %d, diff: %d\n", bcnt, acnt, acnt - bcnt); if (notdone) { printf("FIFO/socket bug is fixed\n"); exit(0); } else { printf("FIFO/socket bug is NOT fixed\n"); exit(-1); } } http://www.cse.scu.edu/~dclark/coen_284_FreeBSD/fifo_socket_leak.txt