Alexander Potapenko via llvm-dev
2018-Sep-25 10:10 UTC
[llvm-dev] Obtaining the origin function for a local var after inlining
On Wed, Sep 19, 2018 at 5:18 PM Adrian Prantl <aprantl at apple.com> wrote:
>
> > On Sep 19, 2018, at 4:08 AM, Alexander Potapenko <glider at google.com> wrote:
> >
> > On Tue, Sep 18, 2018 at 1:56 AM Adrian Prantl <aprantl at apple.com> wrote:
> >>
> >>> On Sep 17, 2018, at 6:59 AM, Alexander Potapenko via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> (I think I've asked a similar question off-list a couple of times, but
> >>> never got an answer)
> >>>
> >>> Hi folks,
> >>>
> >>> For [K]MSAN we need to figure out which inlined function a local var
> >>> originally belonged to in the source file.
> >>
> >> If you are looking at a llvm.dbg.declare/value/addr intrinsic, then the
> >> DILocation attached to the intrinsic indirectly points there:
> >>
> >>   DIScope *Scope = DILocation(dbg_intrinsic.getDebugLoc()).getScope();
> >>   while (!isa<DISubprogram>(Scope))
> >>     Scope = Scope->getScope();
> >>   auto *origFunction = cast<DIFunction>(Scope);
> >
> > This works, thank you!
> >
> > (I had to slightly modify the code FWIW:
> >
> >   DILocation *DIL = dbg_intrinsic.getDebugLoc();
> >   if (DIL) {
> >     DIScope *Scope = DIL->getScope();
> >     // Walk up the lexical scopes until we reach the enclosing function.
> >     while (Scope && !isa<DISubprogram>(Scope))
> >       Scope = Scope->getScope().resolve();
> >     // Scope is null here if no enclosing DISubprogram was found.
> >     auto *origFunction = cast_or_null<DISubprogram>(Scope);
> >   }
> > )
> >
> > I also thought that it would be natural if the AllocaInst
> > corresponding to the llvm.dbg.declare() call shared the same
> > DILocation as the debug intrinsic.
> > Does anyone have an idea why this isn't so?
>
> First off, this is up to the frontend to decide. But generally, an alloca is
> almost always part of the function prologue and the DILocation assigned to it
> is almost meaningless because it won't (directly) get generated into any code
> that could be associated with a debug line table entry. The second reason is
> that the dbg.declare may be inlined and describe an sret value, in which case
> the alloca belongs to the call site's stack frame.
>
> > Right now one needs to build a mapping between AllocaInst and
> > llvm.dbg.declare() in order to get the debug info for the allocation.
>
> No, you can just call llvm::findDbgUsers() to find any debug intrinsics
> referring to any llvm::Instruction.

It also turns out that for certain AllocaInst instances there are no
llvm.dbg.declare intrinsics referring to them, only several
different llvm.dbg.value calls.
For most of them the DILocation matches that of the inlined local
variable; however, some reference other code locations, e.g. places
where these allocas are passed as function parameters.

For example, in the attached IR file (ptrace.ll, generated from the
attached ptrace.c) the following llvm.dbg.value() calls reference
the |siginfo| variable declared at line 888 in ptrace.c:

  %siginfo = alloca %struct.siginfo, align 8
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6801, metadata !DIExpression()) #6, !dbg !7046
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !7022, metadata !DIExpression()) #6, !dbg !7027
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6793, metadata !DIExpression()) #6, !dbg !6967
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6937, metadata !DIExpression()) #6, !dbg !6942
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931
  call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
      metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931

E.g. here the second and the fourth intrinsics have debug info values
!7027 and !6942 pointing at lines 670 and 654 respectively (siginfo_t*
parameters of ptrace_setsiginfo() and ptrace_getsiginfo()).
The last three intrinsics indeed point to line 888, where |siginfo| is declared.

Is the DW_OP_deref tag enough to distinguish the right llvm.dbg.value?

> -- adrian
>
> >> if you want to find the function that it was inlined *into* then you need
> >> to follow the inlinedAt link in the DILocation.
> >>
> >> -- adrian
> >>
> >>> E.g. when a local buffer %buf is declared in @bar(), but @bar() is
> >>> inlined into @foo(), then there's a local %buf.i in @foo(), but we
> >>> need to determine that the local came from @bar(). In the case of
> >>> nested inline functions we need the deepest one.
> >>>
> >>> Is there any existing code for that? If not, which debug info
> >>> constructs do we need to look up to get this information?
> >>>
> >>> https://llvm.org/docs/SourceLevelDebugging.html mentions
> >>> @llvm.dbg.addr as the source of information about a local var, but the
> >>> ToT Clang doesn't emit it. There're calls to @llvm.dbg.declare in
> >>> the IR, but it's said to be deprecated, so I'm not sure if it's ok to
> >>> use it.
> >>>
> >>> Thanks in advance,

--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Managing Directors: Paul Manicle, Halimah DeLaine Prado
Registration court and number: Hamburg, HRB 86891
Registered office: Hamburg
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ptrace.ll
Type: application/octet-stream
Size: 1724394 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180925/41277376/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ptrace.c
Type: text/x-csrc
Size: 33164 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180925/41277376/attachment-0001.c>
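The two lookups discussed in this thread (the function a variable was declared in, and the function it was finally inlined into) are both pointer walks over debug-info metadata. The following is only a minimal sketch of that idea in standard C++, not LLVM code: Scope and Location are hypothetical stand-ins for DIScope and DILocation, and the parent and inlinedAt fields play the role of DIScope::getScope() and DILocation::getInlinedAt().

```cpp
#include <string>

// Hypothetical stand-in for a DIScope node (function or lexical block).
struct Scope {
  std::string name;
  bool isSubprogram;   // true for a function, false for a lexical block
  const Scope *parent; // enclosing scope, or nullptr at the top
};

// Hypothetical stand-in for a DILocation.
struct Location {
  unsigned line;
  const Scope *scope;        // innermost scope this location belongs to
  const Location *inlinedAt; // call site it was inlined at, or nullptr
};

// Walk up the scope chain until a function-level scope is found.
// Returns nullptr if the chain ends without reaching one.
const Scope *enclosingSubprogram(const Scope *s) {
  while (s && !s->isSubprogram)
    s = s->parent;
  return s;
}

// Function the location originally belonged to in the source,
// i.e. the deepest inlinee.
const Scope *originFunction(const Location &loc) {
  return enclosingSubprogram(loc.scope);
}

// Function the code finally ended up in: follow inlinedAt to the
// outermost call site, then walk that location's scope chain.
const Scope *inlinedIntoFunction(const Location &loc) {
  const Location *l = &loc;
  while (l->inlinedAt)
    l = l->inlinedAt;
  return enclosingSubprogram(l->scope);
}
```

With a location for %buf whose scope is a block inside bar() and whose inlinedAt points at a call site in foo(), originFunction() yields bar and inlinedIntoFunction() yields foo. For nested inlining the first call still returns the deepest function, which is the case asked about at the start of the thread.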
Adrian Prantl via llvm-dev
2018-Sep-25 15:38 UTC
[llvm-dev] Obtaining the origin function for a local var after inlining
> On Sep 25, 2018, at 3:10 AM, Alexander Potapenko <glider at google.com> wrote:
>
> It also turns out that for certain AllocaInst instances there're no
> llvm.dbg.declare intrinsics referring to them, only several
> different llvm.dbg.value calls.

In clang, all variables that are stored in allocas are described by
dbg.declares in the frontend, but later LLVM transformations may lower them to
dbg.values. In optimized code it is expected that you will only see
dbg.declares for variables whose address is actually taken.

- A dbg.declare declares that a variable lives in a particular memory
  location. Its DILocation usually points to the declaration of the variable.
- A dbg.value says that the result of a computation (an LLVM SSA value) is the
  current value of a source variable. A dbg.value's DILocation typically
  points to the location of that computation, though we aren't particularly
  consistent about that and the DILocation is really only used for its
  inlinedAt field in the backend.

> For most of them the DILocation match that of the inlined local
> variable, however some reference other code locations, e.g. places
> where these allocas are passed as function parameters.
>
> For example, in the attached IR file (ptrace.ll, generated from the
> attached ptrace.c) the following llvm.dbg.value() calls reference
> the |siginfo| variable declared at line 888 in ptrace.c:
>
>   %siginfo = alloca %struct.siginfo, align 8
>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6801, metadata !DIExpression()) #6, !dbg !7046

this is

  !6801 = !DILocalVariable(name: "from", arg: 2, scope: !6794, file: !6795, line: 14, type: !6798)

>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !7022, metadata !DIExpression()) #6, !dbg !7027

this is

  !7022 = !DILocalVariable(name: "info", arg: 2, scope: !7016, file: !3, line: 670, type: !7019)

>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6793, metadata !DIExpression()) #6, !dbg !6967
>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6937, metadata !DIExpression()) #6, !dbg !6942
>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931
>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931
>   call void @llvm.dbg.value(metadata %struct.siginfo* %siginfo,
>       metadata !6556, metadata !DIExpression(DW_OP_deref)), !dbg !6931

the last three are describing

  !6556 = !DILocalVariable(name: "siginfo", scope: !6546, file: !3, line: 888, type: !5009)

the last two look redundant (probably due to a bug in whatever transformation
inserted them thrice).

> E.g. here the second and the fourth intrinsics have debug info values
> !7027 and !6942 pointing at lines 670 and 654 respectively (siginfo_t*
> parameters of ptrace_setsiginfo() and ptrace_getsiginfo())
> The last three intrinsics indeed point to line 888, where |siginfo| is declared.
>
> Is the DW_OP_deref tag enough to distinguish the right llvm.dbg.value?

I don't understand what you mean by "right" here. These intrinsics are
describing different inlined variables that happen to share the same value at
this point in the program.

-- adrian
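The reply above says that each dbg.value is identified by the variable it describes, not by whether its expression contains DW_OP_deref. As a rough illustration only, the sketch below groups the seven records from the quoted IR by their variable operand and drops exact repeats. It is plain C++ over a hypothetical DbgValueRecord struct (the numeric ids are the metadata numbers from the listing); it does not use the LLVM API, and dropping duplicates is an assumption made for the example, not something the thread prescribes.

```cpp
#include <algorithm>
#include <map>
#include <vector>

// Hypothetical flattened view of one llvm.dbg.value call.
struct DbgValueRecord {
  unsigned variableId; // e.g. 6556 for "metadata !6556"
  bool hasDeref;       // expression is !DIExpression(DW_OP_deref)
  unsigned locationId; // e.g. 6931 for "!dbg !6931"
};

// True when two records carry identical operands.
bool sameRecord(const DbgValueRecord &a, const DbgValueRecord &b) {
  return a.variableId == b.variableId && a.hasDeref == b.hasDeref &&
         a.locationId == b.locationId;
}

// Group records by the variable they describe, keeping one copy of each
// distinct record per variable.
std::map<unsigned, std::vector<DbgValueRecord>>
groupByVariable(const std::vector<DbgValueRecord> &records) {
  std::map<unsigned, std::vector<DbgValueRecord>> groups;
  for (const DbgValueRecord &r : records) {
    std::vector<DbgValueRecord> &group = groups[r.variableId];
    bool seen = std::any_of(
        group.begin(), group.end(),
        [&r](const DbgValueRecord &other) { return sameRecord(other, r); });
    if (!seen)
      group.push_back(r);
  }
  return groups;
}
```

Fed the seven calls from the listing, this produces five groups, one per DILocalVariable, and the three identical !6556 records collapse into one. A client interested in |siginfo| itself would then select the group by variable (here !6556) rather than filter on DW_OP_deref.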