via llvm-dev
2021-May-06 18:06 UTC
[llvm-dev] [debug-info] Stack pointer based variable locations
In functions without a frame pointer, emitting a one-instruction location range at every push/pop (for a potentially large set of stack-homed local variables) does seem like it would take up a lot of space for not much real-world benefit.

On the other hand, having a location range that covers the actual call instruction (or I suppose, more precisely, its return address) would make the locals available to the user when the debugger is stopped in the callee, and that seems *very* valuable.

Wondering what other people think.

--paulr

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of via llvm-dev
Sent: Thursday, May 6, 2021 1:51 PM
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] [debug-info] Stack pointer based variable locations

Hello llvm-dev,

I've noticed some behaviour I found surprising with the way that we emit stack pointer relative variable locations. It seems that locations defined by DBG_VALUEs that are written in terms of RSP (for x86) are terminated by any stack manipulation operations, e.g. pushing arguments before a call. Since we know the stack offset at each adjustment it seems like we could maintain the variable location by generating location list entries with adjusted RSP offsets.

Here's a source reproducer with clang built at 71597d40e878 (recent), target x86_64-unknown-linux-gnu.

$ cat test.cpp
void ext(int, int, int, int, int, int, int, int, int, int);
void escape(int*);
int example() {
  int local = 0;
  escape(&local);
  ext(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
  local += 2;
  return local;
}

$ clang -O2 -g -c test.cpp -o test.o

$ llvm-objdump -d test.o
test.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z7examplev>:
       0: 50                            pushq   %rax
       1: c7 44 24 04 00 00 00 00       movl    $0, 4(%rsp)
       9: 48 8d 7c 24 04                leaq    4(%rsp), %rdi
       e: e8 00 00 00 00                callq   0x13 <_Z7examplev+0x13>
      13: 31 ff                         xorl    %edi, %edi
      15: be 01 00 00 00                movl    $1, %esi
      1a: ba 02 00 00 00                movl    $2, %edx
      1f: b9 03 00 00 00                movl    $3, %ecx
      24: 41 b8 04 00 00 00             movl    $4, %r8d
      2a: 41 b9 05 00 00 00             movl    $5, %r9d
      30: 6a 09                         pushq   $9
      32: 6a 08                         pushq   $8
      34: 6a 07                         pushq   $7
      36: 6a 06                         pushq   $6
      38: e8 00 00 00 00                callq   0x3d <_Z7examplev+0x3d>
      3d: 48 83 c4 20                   addq    $32, %rsp
      41: 8b 44 24 04                   movl    4(%rsp), %eax
      45: 83 c0 02                      addl    $2, %eax
      48: 59                            popq    %rcx
      49: c3                            retq

$ llvm-dwarfdump test.o --name local
test.o: file format elf64-x86-64
0x00000047: DW_TAG_variable
              DW_AT_location (0x00000000:
                 [0x0000000000000001, 0x0000000000000009): DW_OP_consts +0, DW_OP_stack_value
                 [0x0000000000000009, 0x0000000000000032): DW_OP_breg7 RSP+4
                 [0x0000000000000045, 0x000000000000004a): DW_OP_reg0 RAX)
              DW_AT_name ("local")
              DW_AT_decl_file ("/home/och/dev/bugs/scratch/test.cpp")
              DW_AT_decl_line (4)
              DW_AT_type (0x000000ad "int")

The variable 'local' is not given a location over the interval [0x32, 0x45) even though we know where it is (RSP+12, RSP+20, ..., back to RSP+4 after the stack adjustment following the call, since each pushq moves RSP down by 8 bytes). It seems unfortunate to lose variable locations in this way, especially around call sites. Is this a deliberate omission, perhaps made in order to save space? Jeremy mentioned that we do something similar in prologues/epilogues to avoid generating large location lists.

Many thanks,
Orlando
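
As a rough illustration of the idea in the message above (keeping an RSP-relative location alive by emitting one entry per stack adjustment), here is a minimal C++ sketch. It is not how LLVM implements location lists: the `StackAdjust` and `LocEntry` types and the `extendAcrossAdjustments` function are hypothetical names invented for this example, and the sketch assumes the stack adjustment at each instruction is known statically.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration; these are not LLVM data structures.
struct StackAdjust {
  std::uint64_t EndAddr; // address just past the adjusting instruction
  std::int64_t Delta;    // change applied to RSP (negative for a push)
};

struct LocEntry {
  std::uint64_t Begin;
  std::uint64_t End;      // half-open range [Begin, End)
  std::int64_t RspOffset; // the variable lives at RSP + RspOffset
};

// Split [Begin, End) at every stack adjustment and re-base the offset so
// that each entry still names the same stack slot.
std::vector<LocEntry>
extendAcrossAdjustments(std::uint64_t Begin, std::uint64_t End,
                        std::int64_t Offset,
                        const std::vector<StackAdjust> &Adjusts) {
  std::vector<LocEntry> Out;
  std::uint64_t Cur = Begin;
  for (const StackAdjust &A : Adjusts) {
    assert(A.EndAddr >= Cur && A.EndAddr <= End &&
           "adjustments must be sorted and inside the range");
    if (A.EndAddr > Cur)
      Out.push_back({Cur, A.EndAddr, Offset});
    Cur = A.EndAddr;
    // If RSP moves by Delta, a fixed slot moves by -Delta relative to RSP.
    Offset -= A.Delta;
  }
  if (Cur < End)
    Out.push_back({Cur, End, Offset});
  return Out;
}
```

Fed the reproducer's range [0x9, 0x45) with a starting offset of 4, the four 8-byte pushes ending at 0x32, 0x34, 0x36 and 0x38, and the `addq $32` ending at 0x41, this produces six entries. The one covering the call and its return address 0x3d is [0x38, 0x41) at RSP+36. A space-saving variant in the spirit of the reply above might keep only that entry and drop the one-instruction ranges between pushes.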
David Blaikie via llvm-dev
2021-May-07 05:48 UTC
[llvm-dev] [debug-info] Stack pointer based variable locations
Re: prologue: Locations aren't valid in the prologue, so for instance we can give a simple location description of "stack offset 4" for a parameter at -O0, despite the fact that the parameter isn't at stack offset 4 until after we run the prologue that takes the ABI register and stores it into that stack offset. That's handy - because otherwise every parameter would have to use location lists at -O0, which would be a lot of space to spend.

Looks like GCC manages to use DW_OP_fbreg for the example given, which looks like it works/is correct despite the pushes/pops (because it's rbp, I guess - base pointer rather than stack pointer). Perhaps we could do something like that too, in cases like this?
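
To illustrate why a frame-base-relative description can survive pushes and pops, here is a small C++ sketch. It assumes the frame base is defined as the canonical frame address (for example via DW_OP_call_frame_cfa) rather than rbp; that is an assumption made for this example, not something confirmed in the thread. The `FramePoint` type and `singleCfaRelativeOffset` function are hypothetical names, not LLVM or DWARF APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record, for illustration only (not an LLVM or DWARF type).
struct FramePoint {
  std::int64_t VarOffsetFromRsp; // the variable lives at RSP + this value
  std::int64_t CfaOffsetFromRsp; // CFA == RSP + this value at the same address
};

// Returns true and sets CfaRelative if the variable sits at one fixed
// CFA-relative offset at every program point, so that a single
// frame-base-relative expression could describe it without a location list.
bool singleCfaRelativeOffset(const std::vector<FramePoint> &Points,
                             std::int64_t &CfaRelative) {
  assert(!Points.empty() && "need at least one program point");
  const std::int64_t First =
      Points.front().VarOffsetFromRsp - Points.front().CfaOffsetFromRsp;
  for (const FramePoint &P : Points)
    if (P.VarOffsetFromRsp - P.CfaOffsetFromRsp != First)
      return false;
  CfaRelative = First;
  return true;
}
```

In the reproducer, the CFA is RSP+16 once the initial `pushq %rax` has run, and 'local' is at RSP+4, i.e. CFA-12. After the four argument pushes the CFA is RSP+48 and 'local' is at RSP+36, which is still CFA-12. Under the stated assumption, the pushes change the RSP-relative offset but not the frame-base-relative one.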