Hsiangkai Wang via llvm-dev
2018-Mar-30 06:29 UTC
[llvm-dev] [RFC] Generate Debug Information for Labels in Function
On Fri, Mar 30, 2018 at 12:05 AM, Adrian Prantl <aprantl at apple.com> wrote:
>
>
>> On Mar 27, 2018, at 7:41 PM, Hsiangkai Wang via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hello all,
>>
>> I would like to enhance LLVM debug info that supports setting
>> breakpoint on labels in function.
>>
>> Generally, if users use GDB as their debugger, they could set
>> breakpoints on labels in function. Following is an example.
>>
>> // C program
>> static int
>> myfunction (int arg)
>> {
>>   int i, j, r;
>>
>>   j = 0; /* myfunction location */
>>   r = arg;
>>
>>  top:
>>   ++j; /* top location */
>>
>>   if (j == 10)
>>     goto done;
>>
>>   for (i = 0; i < 10; ++i)
>>     {
>>       r += i;
>>       if (j % 2)
>>         goto top;
>>     }
>>
>>  done:
>>   return r;
>> }
>>
>> int
>> main (void)
>> {
>>   int i, j;
>>
>>   for (i = 0, j = 0; i < 1000; ++i)
>>     j += myfunction (0);
>>
>>   return 0;
>> }
>>
>> Following is the GDB commands to illustrate how to set breakpoints on labels.
>>
>> (gdb) b main
>> Breakpoint 1 at 0x10298: file explicit.c, line 50.
>> (gdb) r
>> Starting program: /home/users/kai/sandbox/gdbtest/explicit-gcc
>>
>> Breakpoint 1, main () at explicit.c:50
>> 50 for (i = 0, j = 0; i < 1000; ++i)
>> (gdb) b myfunction:top
>> Breakpoint 2 at 0x10214: file explicit.c, line 26.
>> (gdb) c
>> Continuing.
>>
>> Breakpoint 2, myfunction (arg=0) at explicit.c:27
>> 27 ++j; /* top location */
>> (gdb)
>>
>> However, LLVM does not generate debug information for labels. So, the
>> feature could not work for binaries generated by clang. I also found
>> that the problem is reported in PR35526 and PR36420. I propose an
>> implementation plan to support debug information for labels.
>
> Thank you for working on this! I think it would be good to support labels better. IIRC it currently only generates them from assembler sources in place of a DW_TAG_subprogram.
>
>
>> Following are the steps I propose to implement the feature.
>>
>> 1. Define debug metadata and intrinsic functions for labels.
>>
>> First of all, we need to record debug information in LLVM IR. In LLVM
>> IR, LLVM uses metadata and intrinsic function to keep debug
>> information. So, I need to define new kind of metadata, DILabel, and
>> new intrinsic function, llvm.dbg.label, to associate DILabel with
>> label statement.
>>
>> DILabel will contain name of the label, file metadata, line number,
>> and scope metadata.
>>
>> Intrinsic function llvm.dbg.label uses DILabel metadata as its parameter.
>
> Looking at your testcase in https://reviews.llvm.org/D45043
>
>
>   br label %top
>
> top:
>   call void @llvm.dbg.label(metadata !10), !dbg !11
>   %0 = load i32, i32* %a.addr, align 4
>
> Modelling the IR this way is problematic. In a llvm.dbg.value intrinsic we tie the SSA value the intrinsic describes to the intrinsic by making it an explicit argument of the intrinsic. In the example above, this is not the case, and optimizations will likely move the label and the intrinsic further apart, or even duplicate the intrinsic during loop unrolling. If you want to have additional metadata for a label, I think it would be better to allow a BasicBlock to carry a !dbg attachment. In IR assembler this could look like this:
>
> top, !label !10, !dbg !11:
>

I agree with you. Attach debug metadata to basic block will be a
better solution. I will change my design to convey debug metadata
through basic block instead of intrinsic.

https://reviews.llvm.org/D45078

> That said, perhaps this isn't even necessary. The only information that is stored in DILabel is the name of the label (which is redundant with the actual name of the label) and its source location, which is also stored in the DILocation (!11). I'm wondering if the DILocation of a label is even useful. When a debugger user sets a breakpoint of a label, we might as well use the location of the first instruction in the basic block described by the label, since that is where execution will continue.
>
> Based on that I think it might be sufficient to have a flag on an IR label that marks a user-originated label and triggers the backend to create a DW_TAG_label for it. If we do need source location information for the DW_TAG_label, we could grab it from the first instruction.

I still think that we should collect debug information from source
code level instead of infer from instructions in the basic block. As
Paul said, "the top instructions in a block do not necessarily have a
valid source location." So, I will keep DILabel metadata and remove
llvm.dbg.label intrinsic.

>
> Let me know what you think!
> -- adrian
>
>>
>> 2. Create MI instruction DBG_LABEL.
>>
>> I create new MI instruction DBG_LABEL to keep debug information after
>> LLVM IR converted to MI.
>>
>> DBG_LABEL uses DILabel metadata as its parameter.
>>
>> 3. Create data structure, SDDbgLabel, to store debug information of
>> labels in SelectionDAG.
>>
>> In SelectionDAG, we need a data structure to keep debug information of
>> label. It will keep DILabel metadata.
>>
>> 4. Convert SDDbgLabel to DBG_LABEL in SelectionDAG.
>>
>> After EmitSchedule(), SelectionDAG will be converted to a list of MI
>> instructions. In the function, we will generate DBG_LABEL MachineInstr
>> from SDDbgLabel.
>>
>> For FastISel and GlobalISel, we could convert llvm.dbg.label to
>> DBG_LABEL directly.
>>
>> 5. Collect debug information of labels from MI listing to DebugHandlerBase.
>>
>> Before generating actual debug information in assembly format or
>> object format, we need to keep debug format-independent data in
>> DebugHandlerBase. Afterwards, we could convert these data to CodeView
>> format or DWARF format.
>>
>> 6. Create DWARF DIE specific data structure in DwarfDebug.
>>
>> In class DwarfDebug, we keep DWARF specific data structure for DILabel.
>>
>> 7. Generate DW_TAG_label and fill details of DW_TAG_label.
>>
>> Finally, generating DW_TAG_label DIE and its attributes into DIE structure.
>>
>> I am looking forward to any thoughts & feedback!
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
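To make the goal of the proposal concrete, here is a minimal sketch in C of the lookup a debugger performs for `b myfunction:top`. The record layout only mirrors the fields the proposal lists for DILabel (name, file, line, scope) plus an address; `label_entry`, `find_label` and `example_labels` are hypothetical names, not LLVM or GDB data structures. The first table entry uses the values from the GDB session above; the second entry is invented to show why the enclosing function has to be part of the key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for one source label. The fields follow what the
   proposal says DILabel would carry, plus the address a debugger needs. */
struct label_entry {
    const char *function;   /* enclosing function (the label's scope) */
    const char *name;       /* label name as written in the source */
    const char *file;
    unsigned line;
    unsigned long address;
};

/* Resolve "function:label". Returns NULL when no such label is recorded. */
static const struct label_entry *
find_label(const struct label_entry *table, size_t count,
           const char *function, const char *name)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(table[i].function, function) == 0
            && strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}

static const struct label_entry example_labels[] = {
    /* Values taken from the GDB session in the message above. */
    { "myfunction", "top", "explicit.c", 26, 0x10214UL },
    /* Invented entry: the same label name in a different function. */
    { "other_function", "top", "explicit.c", 90, 0x10400UL },
};
```

Because label names are only unique within one function, a lookup by name alone would be ambiguous here, which is presumably why the proposed DILabel carries scope metadata.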
Adrian Prantl via llvm-dev
2018-Mar-30 16:25 UTC
[llvm-dev] [RFC] Generate Debug Information for Labels in Function
> On Mar 29, 2018, at 11:29 PM, Hsiangkai Wang <hsiangkai at gmail.com> wrote:
>
> I agree with you. Attach debug metadata to basic block will be a
> better solution. I will change my design to convey debug metadata
> through basic block instead of intrinsic.
>
> https://reviews.llvm.org/D45078

In this revised design it is now possible to attach a DILabel to a BasicBlock. When the basic block is inlined it will be ambiguous to which function the DILabel belongs. For instructions, we store the inline information in the inlinedAt: field of its DILocation. In order to handle inlining for DILabels we have two options:

1. Also attach a DILocation to be associated with the label to carry the inline information, and teach the inliner to correctly update the DILocation on basic blocks during inlining. This would also solve the issue of hypothetical scoped labels that Paul brought up. We'll also need to figure out what to do when two labels are being merged by a transformation.

2. Teach the inliner to drop all metadata attachments on basic blocks.

Option (2) is obviously going to be easier to implement and might be a good as a first step.

>
>> That said, perhaps this isn't even necessary. The only information that is stored in DILabel is the name of the label (which is redundant with the actual name of the label) and its source location, which is also stored in the DILocation (!11). I'm wondering if the DILocation of a label is even useful. When a debugger user sets a breakpoint of a label, we might as well use the location of the first instruction in the basic block described by the label, since that is where execution will continue.
>>
>> Based on that I think it might be sufficient to have a flag on an IR label that marks a user-originated label and triggers the backend to create a DW_TAG_label for it. If we do need source location information for the DW_TAG_label, we could grab it from the first instruction.
> I still think that we should collect debug information from source
> code level instead of infer from instructions in the basic block. As
> Paul said, "the top instructions in a block do not necessarily have a
> valid source location." So, I will keep DILabel metadata and remove
> llvm.dbg.label intrinsic.

I'm still not convinced that this information will be useful to a debugger, but if you have a compelling use-case please let me know.

-- adrian
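The two inlining options in this message can be contrasted with a small toy model in C. This is only a sketch under stated assumptions: `toy_block`, `inline_keeping_labels` and `inline_dropping_labels` are hypothetical names, a "block" is reduced to an optional label plus a call-site line, and nothing here is LLVM's inliner or its BasicBlock class.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a basic block with an optional label attachment.
   Hypothetical; not LLVM's BasicBlock. */
struct toy_block {
    const char *label;      /* NULL when no label metadata is attached */
    unsigned inlined_at;    /* call-site line, 0 when not inlined */
};

/* Option 1: keep the label on the inlined copy and record the call site,
   analogous to the inlinedAt: field on an instruction's DILocation. */
static void
inline_keeping_labels(const struct toy_block *callee, size_t count,
                      unsigned call_site_line, struct toy_block *out)
{
    assert(callee != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        out[i] = callee[i];
        out[i].inlined_at = callee[i].label != NULL ? call_site_line : 0;
    }
}

/* Option 2: copy the blocks but drop every label attachment, so nothing
   ambiguous survives inlining. */
static void
inline_dropping_labels(const struct toy_block *callee, size_t count,
                       struct toy_block *out)
{
    assert(callee != NULL && out != NULL);
    for (size_t i = 0; i < count; ++i) {
        out[i] = callee[i];
        out[i].label = NULL;
        out[i].inlined_at = 0;
    }
}
```

In this model option 2 is a two-line change, while option 1 needs every transformation that copies blocks to know the call site, which matches the message's remark that option 2 is the easier first step.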
Adrian Prantl via llvm-dev
2018-Mar-30 16:39 UTC
[llvm-dev] [RFC] Generate Debug Information for Labels in Function
> On Mar 30, 2018, at 9:25 AM, Adrian Prantl via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
>
>> On Mar 29, 2018, at 11:29 PM, Hsiangkai Wang <hsiangkai at gmail.com> wrote:
>>
>> I agree with you. Attach debug metadata to basic block will be a
>> better solution. I will change my design to convey debug metadata
>> through basic block instead of intrinsic.
>>
>> https://reviews.llvm.org/D45078
>
> In this revised design it is now possible to attach a DILabel to a BasicBlock. When the basic block is inlined it will be ambiguous to which function the DILabel belongs. For instructions, we store the inline information in the inlinedAt: field of its DILocation. In order to handle inlining for DILabels we have two options:
>
> 1. Also attach a DILocation to be associated with the label to carry the inline information, and teach the inliner to correctly update the DILocation on basic blocks during inlining. This would also solve the issue of hypothetical scoped labels that Paul brought up. We'll also need to figure out what to do when two labels are being merged by a transformation.
>
> 2. Teach the inliner to drop all metadata attachments on basic blocks.
>
> Option (2) is obviously going to be easier to implement and might be a good as a first step.

I'm really sorry for not realizing this yesterday, but the problems pertaining to inlining made me realize that your original design with the dbg.label intrinsic might actually be a better approach especially when considering optimized code. We will get inlining support for free because it is just another instruction and it can deal with more than one label at the same address. It looks a bit more complicated in unoptimized code, but that seems like a small price to pay. We just need to make sure that the backend doesn't get confused when loop unrolling duplicates a dbg.label but that should be doable.

-- adrian

>
>>
>>> That said, perhaps this isn't even necessary. The only information that is stored in DILabel is the name of the label (which is redundant with the actual name of the label) and its source location, which is also stored in the DILocation (!11). I'm wondering if the DILocation of a label is even useful. When a debugger user sets a breakpoint of a label, we might as well use the location of the first instruction in the basic block described by the label, since that is where execution will continue.
>>>
>>> Based on that I think it might be sufficient to have a flag on an IR label that marks a user-originated label and triggers the backend to create a DW_TAG_label for it. If we do need source location information for the DW_TAG_label, we could grab it from the first instruction.
>> I still think that we should collect debug information from source
>> code level instead of infer from instructions in the basic block. As
>> Paul said, "the top instructions in a block do not necessarily have a
>> valid source location." So, I will keep DILabel metadata and remove
>> llvm.dbg.label intrinsic.
>
> I'm still not convinced that this information will be useful to a debugger, but if you have a compelling use-case please let me know.
>
> -- adrian
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
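The message leaves open how the backend should cope when loop unrolling duplicates a dbg.label. One conceivable policy, shown here purely as an assumption and not as something the thread settles or LLVM is known to do, is to keep only the first marker seen for each label name within a function. The sketch below models that in C; `label_marker` and `keep_first_markers` are hypothetical names, not LLVM's DBG_LABEL machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for one label marker in a function's instruction stream.
   Hypothetical; not LLVM's DBG_LABEL MachineInstr. */
struct label_marker {
    const char *name;
    unsigned long address;
};

/* Copy markers to `out`, keeping only the first occurrence of each name.
   Returns the number of markers kept. `out` must have room for `count`
   entries. */
static size_t
keep_first_markers(const struct label_marker *in, size_t count,
                   struct label_marker *out)
{
    size_t kept = 0;

    assert(in != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        int seen = 0;
        for (size_t k = 0; k < kept; ++k) {
            if (strcmp(out[k].name, in[i].name) == 0) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[kept++] = in[i];
    }
    return kept;
}
```

Other policies are equally imaginable, for example emitting one entry per copy so a debugger can stop at every unrolled instance; the sketch only shows that duplicates are easy to detect once the label is an ordinary entry in the instruction stream.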