Pablo González de Aledo
2015-Apr-23  01:09 UTC
[LLVMdev] Get precise line/column debug info from LLVM IR
I am trying to locate instructions in an LLVM Pass by line and column
number (reported by a third-party tool) to instrument them. To achieve
this, I am compiling my source files with `clang -g -O0 -emit-llvm` and
looking for the information in the metadata using this code:
    const DebugLoc &location = instruction->getDebugLoc();
    // location.getLine()
    // location.getCol()
Unfortunately, this information is absolutely imprecise. Consider the
following implementation of the Fibonacci function:
    unsigned fib(unsigned n) {
        if (n < 2)
            return n;
        unsigned f = fib(n - 1) + fib(n - 2);
        return f;
    }
I would like to locate the single LLVM instruction corresponding to the
assignment `unsigned f = ...` in the resulting LLVM IR. I am not interested
in all the calculations of the right-hand side. The generated LLVM block
including relevant debug metadata is:
    [...]
    if.end:                                           ; preds = %entry
      call void @llvm.dbg.declare(metadata !{i32* %f}, metadata !17), !dbg !18
      %2 = load i32* %n.addr, align 4, !dbg !19
      %sub = sub i32 %2, 1, !dbg !19
      %call = call i32 @fib(i32 %sub), !dbg !19
      %3 = load i32* %n.addr, align 4, !dbg !20
      %sub1 = sub i32 %3, 2, !dbg !20
      %call2 = call i32 @fib(i32 %sub1), !dbg !20
      %add = add i32 %call, %call2, !dbg !20
      store i32 %add, i32* %f, align 4, !dbg !20
      %4 = load i32* %f, align 4, !dbg !21
      store i32 %4, i32* %retval, !dbg !21
      br label %return, !dbg !21
    [...]
    !17 = metadata !{i32 786688, metadata !4, metadata !"f", metadata !5, i32 5, metadata !8, i32 0, i32 0} ; [ DW_TAG_auto_variable ] [f] [line 5]
    !18 = metadata !{i32 5, i32 11, metadata !4, null}
    !19 = metadata !{i32 5, i32 15, metadata !4, null}
    !20 = metadata !{i32 5, i32 28, metadata !4, null}
    !21 = metadata !{i32 6, i32 2, metadata !4, null}
    !22 = metadata !{i32 7, i32 1, metadata !4, null}
As you can see, the metadata `!dbg !20` of the `store` instruction points
to **line 5 column 28**, which is the call to `fib(n - 2)`. Even worse, the
add operation and the subtraction `n - 2` both also point to that function
call, identified by `!dbg !20`.
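To make the ambiguity concrete, here is a minimal sketch that groups the instructions of the `if.end` block by their (line, column) pair. It deliberately does not use the LLVM API: `InstrLoc`, `groupByLocation` and `ifEndBlock` are hypothetical names, and the data is copied by hand from the listing above.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an instruction plus its debug location.
// This is a model of the listing above, not the LLVM API.
struct InstrLoc {
    std::string text;  // textual form of the instruction
    unsigned line;     // line taken from the !dbg node
    unsigned col;      // column taken from the !dbg node
};

using Location = std::pair<unsigned, unsigned>;

// Group instructions by (line, column) so that shared locations become visible.
std::map<Location, std::vector<std::string>>
groupByLocation(const std::vector<InstrLoc> &instrs) {
    std::map<Location, std::vector<std::string>> groups;
    for (const InstrLoc &i : instrs)
        groups[Location(i.line, i.col)].push_back(i.text);
    return groups;
}

// The instructions of if.end that carry !dbg !19 (5:15) or !dbg !20 (5:28).
std::vector<InstrLoc> ifEndBlock() {
    return {
        {"%2 = load i32* %n.addr", 5, 15},
        {"%sub = sub i32 %2, 1", 5, 15},
        {"%call = call i32 @fib(i32 %sub)", 5, 15},
        {"%3 = load i32* %n.addr", 5, 28},
        {"%sub1 = sub i32 %3, 2", 5, 28},
        {"%call2 = call i32 @fib(i32 %sub1)", 5, 28},
        {"%add = add i32 %call, %call2", 5, 28},
        {"store i32 %add, i32* %f", 5, 28},
    };
}
```

Calling `groupByLocation(ifEndBlock())` yields two groups, and the `store` lands in the same group as the second call, so a lookup by line and column alone cannot single it out.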
Interestingly, the Clang AST emitted by `clang -Xclang -ast-dump
-fsyntax-only` has all that information. Thus, I suspect that it is somehow
lost during the code generation phase. It seems that during code generation
Clang reaches some internal sequence point and associates all following
instructions to that position until the next sequence point (e.g. function
call) occurs. For completeness, here is the declaration statement in the
AST:
    |-DeclStmt 0x7ffec3869f48 <line:5:2, col:38>
    | `-VarDecl 0x7ffec382d680 <col:2, col:37> col:11 used f 'unsigned int' cinit
    |   `-BinaryOperator 0x7ffec3869f20 <col:15, col:37> 'unsigned int' '+'
    |     |-CallExpr 0x7ffec382d7e0 <col:15, col:24> 'unsigned int'
    |     | |-ImplicitCastExpr 0x7ffec382d7c8 <col:15> 'unsigned int (*)(unsigned int)' <FunctionToPointerDecay>
    |     | | `-DeclRefExpr 0x7ffec382d6d8 <col:15> 'unsigned int (unsigned int)' Function 0x7ffec382d490 'fib' 'unsigned int (unsigned int)'
    |     | `-BinaryOperator 0x7ffec382d778 <col:19, col:23> 'unsigned int' '-'
    |     |   |-ImplicitCastExpr 0x7ffec382d748 <col:19> 'unsigned int' <LValueToRValue>
    |     |   | `-DeclRefExpr 0x7ffec382d700 <col:19> 'unsigned int' lvalue ParmVar 0x7ffec382d3d0 'n' 'unsigned int'
    |     |   `-ImplicitCastExpr 0x7ffec382d760 <col:23> 'unsigned int' <IntegralCast>
    |     |     `-IntegerLiteral 0x7ffec382d728 <col:23> 'int' 1
    |     `-CallExpr 0x7ffec3869ef0 <col:28, col:37> 'unsigned int'
    |       |-ImplicitCastExpr 0x7ffec3869ed8 <col:28> 'unsigned int (*)(unsigned int)' <FunctionToPointerDecay>
    |       | `-DeclRefExpr 0x7ffec3869e10 <col:28> 'unsigned int (unsigned int)' Function 0x7ffec382d490 'fib' 'unsigned int (unsigned int)'
    |       `-BinaryOperator 0x7ffec3869eb0 <col:32, col:36> 'unsigned int' '-'
    |         |-ImplicitCastExpr 0x7ffec3869e80 <col:32> 'unsigned int' <LValueToRValue>
    |         | `-DeclRefExpr 0x7ffec3869e38 <col:32> 'unsigned int' lvalue ParmVar 0x7ffec382d3d0 'n' 'unsigned int'
    |         `-ImplicitCastExpr 0x7ffec3869e98 <col:36> 'unsigned int' <IntegralCast>
    |           `-IntegerLiteral 0x7ffec3869e60 <col:36> 'int' 2
Is it possible either to improve the accuracy of the debug metadata or to
resolve the corresponding instruction in a different way? Ideally, I would
like to leave Clang untouched, i.e. not modify and recompile it.
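One possible "different way", offered only as a sketch and not as something confirmed to be robust: instead of matching the column, find the `llvm.dbg.declare` for the variable declared on the line of interest and take the first later `store` to the same pointer. The `Instr` struct and `findInitializingStore` below are hypothetical and model the block by hand; a real pass would have to obtain the same facts from the LLVM API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an instruction; this is not the LLVM API.
struct Instr {
    std::string opcode;   // e.g. "dbg.declare", "store", "call"
    std::string pointer;  // address operand of dbg.declare/load/store, else empty
    unsigned line;        // line of the attached !dbg location
};

// Return the index of the first store into the variable whose dbg.declare
// sits on declLine, or -1 if no such declare or store exists in the block.
int findInitializingStore(const std::vector<Instr> &block, unsigned declLine) {
    std::string var;  // pointer named by the matching dbg.declare
    for (std::size_t i = 0; i < block.size(); ++i) {
        const Instr &in = block[i];
        if (var.empty()) {
            // Still looking for the declaration on the requested line.
            if (in.opcode == "dbg.declare" && in.line == declLine)
                var = in.pointer;
        } else if (in.opcode == "store" && in.pointer == var) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

The heuristic assumes unoptimized (`-O0`) output, where the local lives in an alloca and the initializer is written with a plain `store`, as in the listing above. It only finds the initializing store of a declaration, not later assignments.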
Eric Christopher
2015-Apr-23  01:20 UTC
[LLVMdev] Get precise line/column debug info from LLVM IR
Try upgrading :)
    dzur:~/tmp> ~/builds/build-llvm/Debug+Asserts/bin/clang -g -S -emit-llvm -o - foo.c | grep "\!22"
      call void @llvm.dbg.declare(metadata i32* %f, metadata !21, metadata !13), !dbg !22
      store i32 %add, i32* %f, align 4, !dbg !22
    !22 = !MDLocation(line: 5, column: 12, scope: !4)
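For checking output like this outside of a pass, the textual `!MDLocation(line: 5, column: 12, scope: !4)` node can be parsed with a small helper. This is a minimal sketch, assuming the field order shown in the reply (line first, then column); `SourceLoc` and `parseMDLocation` are hypothetical names and not part of LLVM.

```cpp
#include <regex>
#include <string>

// Line/column pair extracted from textual IR; valid is false when the
// input does not contain an !MDLocation node in the expected form.
struct SourceLoc {
    bool valid;
    unsigned line;
    unsigned column;
};

// Extract line and column from a string such as
//   !22 = !MDLocation(line: 5, column: 12, scope: !4)
SourceLoc parseMDLocation(const std::string &text) {
    static const std::regex pattern(
        R"(!MDLocation\(line:\s*(\d+),\s*column:\s*(\d+))");
    std::smatch m;
    if (!std::regex_search(text, m, pattern))
        return {false, 0, 0};
    return {true,
            static_cast<unsigned>(std::stoul(m[1].str())),
            static_cast<unsigned>(std::stoul(m[2].str()))};
}
```

The older `metadata !{i32 5, i32 28, ...}` form from the original post does not match this pattern and is reported as invalid, which makes it easy to tell the two output formats apart.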
Pablo González de Aledo
2015-Apr-23  01:39 UTC
[LLVMdev] Get precise line/column debug info from LLVM IR
Hi Eric, thanks for the quick answer. Which version are you using? Here we
tried it with the following (both on MacOS):
    Apple LLVM version 6.1.0 (clang-602.0.49) (based on LLVM 3.6.0svn)
    clang version 3.5.1 (tags/RELEASE_351/final)
Best regards.