Joseph Tremoulet via llvm-dev
2020-Mar-04 20:16 UTC
[llvm-dev] Continuing from dbgtrap on different targets
Hi,

I'm noticing an unexpected difference between targets when I hit a dbgtrap in the debugger. Consider this simple LLVM function:

    define void @do_break() {
    entry:
      call void @llvm.debugtrap()
      ret void
    }

If I compile that with llc and use lldb to launch a program that calls it, on x86_64 Linux (Ubuntu 18.04), here's what I see at the stop:

    Process 130404 stopped
    * thread #1, name = 'doit', stop reason = signal SIGTRAP
        frame #0: 0x0000000000400541 doit`do_break at stub.ll:2:1
    (lldb) disas
    doit`do_break:
        0x400540 <+0>: int3
    ->  0x400541 <+1>: retq
    (lldb) register read rip
         rip = 0x0000000000400541  doit`do_break + 1 at stub.ll:2:1

Note that rip is reported as pointing to the next instruction after the int3. If I 'continue' from there, the program continues doing whatever was after the debugtrap.

If I follow the same steps on aarch64 (also Ubuntu 18.04), I see this:

    Process 21586 stopped
    * thread #1, name = 'doit', stop reason = signal SIGTRAP
        frame #0: 0x00000000004005dc doit`do_break at stub.ll:1:1
    (lldb) disas
    doit`do_break:
    ->  0x4005dc <+0>: brk    #0x1
        0x4005e0 <+4>: ret
    (lldb) register read pc
          pc = 0x00000000004005dc  doit`do_break at stub.ll:1:1

Note that here, pc is reported as pointing at the 'brk' instruction itself. If I 'continue' from there, I immediately find myself stopped back at the same point, ad infinitum.

From what I can tell, GDB also "gets stuck" when it hits this instruction on aarch64 (and also doesn't on x86_64).

I'm wondering what to make of this / where's the "bug".

* Should llvm use a different lowering for dbgtrap on aarch64-linux? I don't think so; it seems to be standard.
* Should the system signal handler be reporting an incremented pc in the context struct when it hits brk?
* And even if so, what should the workaround be for systems without such a fix?
* Should lldb (and gdb for that matter) somehow recognize this case, and increment pc when stopping or resuming at a brk?
* Is this just unsupported? Is continuing past a debugtrap UB or otherwise disallowed?

I'd appreciate any insights here.

Thanks,
-Joseph
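The two stops above differ only in where the reported PC lands relative to the trap instruction. The following is a toy model of that difference, not real debugger or kernel code: the `Target` class, its fields, and `resume` are hypothetical names invented for illustration, and the instruction tables simply reuse the addresses from the two disassemblies. It sketches why 'continue' makes progress in the x86_64 case but stops at the same address every time in the aarch64 case.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """Toy model of a stopped process (hypothetical, for illustration only)."""
    code: dict                  # address -> (mnemonic, size in bytes)
    pc: int                     # program counter as reported at the stop
    pc_advances_on_trap: bool   # True models int3 on x86_64, False models brk on aarch64

    def resume(self, max_steps=10):
        """Run from the current pc until the next trap or 'ret'."""
        for _ in range(max_steps):
            mnemonic, size = self.code[self.pc]
            if mnemonic == "trap":
                # The only modelled difference: is the pc left after the trap or on it?
                if self.pc_advances_on_trap:
                    self.pc += size
                return ("SIGTRAP", self.pc)
            if mnemonic == "ret":
                return ("returned", self.pc)
            self.pc += size
        return ("running", self.pc)


# State at the reported stops, using the addresses from the lldb sessions above.
x86 = Target({0x400540: ("trap", 1), 0x400541: ("ret", 1)},
             pc=0x400541, pc_advances_on_trap=True)
arm = Target({0x4005DC: ("trap", 4), 0x4005E0: ("ret", 4)},
             pc=0x4005DC, pc_advances_on_trap=False)
```

In this model, `x86.resume()` reaches the `ret`, while every call to `arm.resume()` reports a trap at 0x4005dc again, which mirrors the "ad infinitum" behaviour described in the message.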
Jim Ingham via llvm-dev
2020-Mar-04 20:45 UTC
[llvm-dev] [lldb-dev] Continuing from dbgtrap on different targets
As you have seen, different machine architectures do different things after hitting a trap. On x86_64, the trap instruction is executed and then you stop, so the PC is left after the trap. On arm64, when execution halts the pc is still pointing at the trap instruction.

I don't think lldb should be in the business of telling systems how they should report stops, especially since that is certainly something we can handle in lldb.

For traps that lldb recognizes as ones it is using for breakpoints, it already has to handle this difference for you. But for traps we know nothing about, we don't do anything special.

I think it would be entirely reasonable that whenever lldb encounters a trap instruction that isn't one of ours, it should always move the PC after the trap before returning control to the user. I can't see why you would want to keep hitting the trap over and over. I've received several bug reports (on the Apple bug reporter side) asking for this feature. This might be something we teach lldb-server & debugserver to do, rather than lldb, but that's an implementation detail...

For now, on architectures where the trap doesn't execute, you just need to move the pc past the trap by hand (with the "thread jump" command) before continuing. That has always been safe on arm64 so far as I can tell.

Jim
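As a worked example of the manual workaround, the address to jump to is the stop address plus the size of the trap instruction on targets where the PC is left on the trap. The sketch below is only arithmetic under stated assumptions: `resume_pc` and `jump_command` are hypothetical helpers, the tables cover just the two architectures discussed in this thread, and the `--address` spelling of the `thread jump` option is an assumption worth checking against `help thread jump` in your lldb.

```python
# Size of the trap instruction in bytes: int3 is the single-byte opcode 0xCC,
# and brk is a fixed-width 4-byte A64 instruction.
TRAP_SIZE = {"x86_64": 1, "aarch64": 4}

# Whether the reported PC is already past the trap when the debugger stops,
# as observed in this thread for these two targets.
PC_ALREADY_PAST_TRAP = {"x86_64": True, "aarch64": False}


def resume_pc(arch, stop_pc):
    """Return the address to continue from after stopping at an unknown trap."""
    if PC_ALREADY_PAST_TRAP[arch]:
        return stop_pc
    return stop_pc + TRAP_SIZE[arch]


def jump_command(arch, stop_pc):
    """Format an lldb command that moves the pc there (option name assumed)."""
    return "thread jump --address 0x%x" % resume_pc(arch, stop_pc)
```

For the aarch64 stop reported in this thread, `resume_pc("aarch64", 0x4005dc)` gives 0x4005e0, which is the `ret` following the `brk` in the disassembly; for the x86_64 stop the address is returned unchanged.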
Pavel Labath via llvm-dev
2020-Mar-05 07:29 UTC
[llvm-dev] [lldb-dev] Continuing from dbgtrap on different targets
Yes, this is something that has bugged me too. While I think it would be nice if the OSes hid these architecture quirks (hell, I think it would be nice if the CPU manufacturers made this consistent so that the OS doesn't need to hide it), I think that changing that at this point is very unlikely, and so working around it in lldb is probably the best we can do.

I am not sure what the official position on continuing from a debug trap is, but I think that without that ability, the concept would be pretty useless. A quick example <https://godbolt.org/z/-8voBz> shows that clang produces the "expected" output even at -O3. In fact, on aarch64, __builtin_debugtrap() and __builtin_trap() produce the same instruction, and the only difference between them is that the latter also triggers DCE of everything coming after it.

pl