Hi, All.

Does anybody know about ConstantExpr in LLVM? What is it? Since it always appears after LLVM optimization (for example at -O2), what is the code generator supposed to do with it? My guess is that it represents a constant value that can be determined or computed at compile time (actually at link time) to improve performance, although we do not know the actual constant value until the object file is linked.

Here is my example. Even so, there is still code that computes the value at run time.

cat a.C

    int n=5;
    int main(){
      long a = (long)&n+7;
      int b = a;
      return b;
    }

clang++ a.C -c -O2 -emit-llvm -S;cat a.ll

    ; ModuleID = 'a.C'
    target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
    target triple = "x86_64-apple-macosx10.12.0"

    @n = global i32 5, align 4

    ; Function Attrs: norecurse nounwind readnone ssp uwtable
    define i32 @main() #0 {
      ret i32 trunc (i64 add (i64 ptrtoint (i32* @n to i64), i64 7) to i32)
    }

clang++ a.C -c -O2;objdump -d a.O

    a.O:    file format Mach-O 64-bit x86-64

    Disassembly of section __TEXT,__text:
    _main:
       0:   55                      pushq   %rbp
       1:   48 89 e5                movq    %rsp, %rbp
       4:   48 8d 05 00 00 00 00    leaq    (%rip), %rax
       b:   83 c0 07                *addl   $7, %eax*
       e:   5d                      popq    %rbp
       f:   c3                      retq

I am confused about what its purpose is in LLVM.

Thanks.
---------
Zeson
On 3/9/17 8:28 AM, Zeson Wu via llvm-dev wrote:
> Hi, All.
>
> Does anybody know about ConstantExpr in llvm? What's it?
> Since it always appears after llvm optimization such as -O2 level,
> what is it supposed to be to codegen? I am wondering it represents
> constant value which can be determined or computed at
> compile-time(actually is link-time) to improve performance. Although
> we do not know the actual constant value until the object file is linked.

You've pretty much got it. A Constant Expression (ConstantExpr) is simply a constant value. Since some constant values depend upon architecture-dependent features (e.g., structure layout, pointer size, etc.), LLVM provides the ConstantExpr to represent them in a (more or less) architecture-independent way.

For example, a GEP with constant indices on an internal global variable will always compute the same value; it is a constant. However, we use a GEP ConstantExpr to represent it; the backend code generator converts it to the appropriate numerical constant when generating native code.

For more information on the ConstantExpr, please see the LLVM Language Reference Manual (http://llvm.org/docs/LangRef.html#constant-expressions).

Regards,

John Criswell

> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
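The point about constant-index GEPs can be seen from the source side as well. The following is a minimal C++ sketch, not taken from the thread: the names `Pair`, `g_pair`, `g_field`, and `field_offset` are hypothetical. It shows a pointer to a field of a global, which is a fixed, layout-dependent offset from the global's symbol. A front end would typically express such an initializer in IR as a getelementptr constant expression on the global rather than as instructions.

```cpp
#include <cstddef>

// Hypothetical example type; the offset of `value` depends on the target's
// structure layout rules, which is the kind of detail a ConstantExpr hides.
struct Pair {
    char tag;
    int value;
};

// A global variable and a pointer to one of its fields. The initializer of
// g_field is an address constant: the symbol g_pair plus a fixed offset.
Pair g_pair = {'x', 42};
int *const g_field = &g_pair.value;

// Byte distance between the start of the global and the field pointer.
std::ptrdiff_t field_offset() {
    return reinterpret_cast<char *>(g_field) -
           reinterpret_cast<char *>(&g_pair);
}
```

Whatever the target, `field_offset()` agrees with `offsetof(Pair, value)`; only the numeric value of that offset is architecture-dependent.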
Still, thanks, John.

Here is another example:

    int a;

    int main(){
      return 5+(long)(&a);
    }

At -O0, the IR looks like this:

    @a = global i32 0, align 4

    ; Function Attrs: noinline norecurse nounwind
    define signext i32 @main() #0 {
      %1 = alloca i32, align 4
      store i32 0, i32* %1, align 4
      ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
    }

At -O2, the IR is optimized to this:

    @a = global i32 0, align 4

    ; Function Attrs: norecurse nounwind readnone
    define signext i32 @main() local_unnamed_addr #0 {
      ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
    }

What I mean is: what is the advantage of this ConstantExpr pattern, since it is introduced at -O2? And how does the back end handle this pattern (which was a bitcast operator in the example from my previous email)?

Thanks.

2017-03-11 0:32 GMT+08:00 John Criswell <jtcriswel at gmail.com>:

> On 3/9/17 11:28 PM, Zeson Wu wrote:
>> OK. Could you give a specific example to illustrate what you are
>> talking about? Some comparison would be better.
>>
>> Here is an example of a BitCast ConstantExpr.
>>
>>     target datalayout = "e-m:e-i64:64-n32:64"
>>     target triple = "powerpc64le-unknown-linux-gnu"
>>
>>     ; Function Attrs: norecurse
>>     define signext i32 @main() local_unnamed_addr #0 {
>>     entry:
>>       %call = tail call signext i32 @f(
>>           [4 x i128] [i128 0, i128 undef,
>>                       i128 1334440659191554973911497130241, i128 undef],
>>           [1 x i128] [i128 bitcast (<16 x i8> <i8 2, i8 2, i8 2, i8 2,
>>                       i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2,
>>                       i8 2, i8 2, i8 2, i8 2> to i128)],
>>           i32 signext 16)
>>       ret i32 %call
>>     }
>>
>>     declare signext i32 @f([4 x i128], [1 x i128], i32 signext) local_unnamed_addr #1
>>
>> I am wondering why the result of the bitcast is not computed directly
>> after -O2 optimization. How is the backend supposed to handle this
>> ConstantExpr? Is there any advantage to doing this?
>
> The bitcast exists because the LLVM IR has type information; when you
> bitcast a constant from one type to another, its value doesn't change, but
> its static type does. It's essentially a way of expressing the same
> constant but with a different static type.
>
> Also, please keep conversations on the list instead of emailing me
> directly. That way, others can benefit from the conversation and can chime
> in if needed.
>
> Regards,
>
> John Criswell

--
Zeson
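The statement that a bitcast keeps the value and changes only the static type can be sketched in plain C++. This is an illustration only, not how LLVM implements the cast: the function names are hypothetical, and two 64-bit halves stand in for the i128 because standard C++ has no portable 128-bit integer type.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Reinterpret 16 bytes as two 64-bit integers without changing any bit.
// This mirrors "bitcast <16 x i8> ... to i128": same bits, new static type.
std::array<std::uint64_t, 2>
bitcast_bytes(const std::array<std::uint8_t, 16> &bytes) {
    std::array<std::uint64_t, 2> halves{};
    static_assert(sizeof(halves) == sizeof(bytes),
                  "a bitcast requires source and destination of equal size");
    std::memcpy(halves.data(), bytes.data(), sizeof(halves));
    return halves;
}

// The vector from the quoted IR: sixteen bytes, each holding the value 2.
std::array<std::uint64_t, 2> demo_bitcast() {
    std::array<std::uint8_t, 16> twos;
    twos.fill(2);
    return bitcast_bytes(twos);
}
```

Because every byte is identical, each half reads as 0x0202020202020202 regardless of byte order. No arithmetic was performed to get there, which is why the cast can stay in the IR as a typed constant.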
On 3/17/17 6:32 AM, Zeson Wu wrote:
> I mean what's the advantage of that pattern with constantexpr since it
> is introduced in O2 mode?

I see the constant expression in both the -O0 and -O2 LLVM assembly files above; I don't think the optimizations are adding it to the code.

The advantage of the constant expression in this case is that LLVM can express the constant in an architecture-independent way. First, you can't add a pointer to an integer, so a constant cast (here, ptrtoint) is needed. Second, even if the cast wasn't necessary, the value of "@a" hasn't been determined yet (as code generation hasn't occurred yet), so the constant needs to be represented symbolically.

This is why constant expressions exist: they allow constants to be represented symbolically, and they allow constants to be represented in a way that type checks.

> How does back end handle this pattern (which is a bitcast operator in
> my last case in email before)?

After picking the location of the global variable "a", the backend should be able to simplify the constant expression into a single integer value. Use clang -S to see the assembly code that LLVM generates; I suspect you'll see the constant expression simplified to a single constant value.

Regards,

John Criswell
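The idea of a constant that is kept symbolic until an address is chosen can be modelled with a small expression tree. The sketch below is a toy model under stated assumptions: `ConstNode`, `Kind`, `evaluate`, and the builder functions are hypothetical names and are not LLVM's ConstantExpr API. It only shows that `trunc (add (ptrtoint @a), 5)` can be built before the address of `a` is known and reduced to a single integer afterwards.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

// Toy model of a symbolic constant (hypothetical, not LLVM's classes).
enum class Kind { SymbolAddress, Integer, Add, Trunc32 };

struct ConstNode {
    Kind kind;
    std::uint64_t literal;            // used only by Kind::Integer
    std::unique_ptr<ConstNode> lhs;   // first operand, if any
    std::unique_ptr<ConstNode> rhs;   // second operand, if any
};

using NodePtr = std::unique_ptr<ConstNode>;

NodePtr symbol_address() {
    return NodePtr(new ConstNode{Kind::SymbolAddress, 0, nullptr, nullptr});
}
NodePtr integer(std::uint64_t v) {
    return NodePtr(new ConstNode{Kind::Integer, v, nullptr, nullptr});
}
NodePtr add(NodePtr a, NodePtr b) {
    return NodePtr(new ConstNode{Kind::Add, 0, std::move(a), std::move(b)});
}
NodePtr trunc32(NodePtr a) {
    return NodePtr(new ConstNode{Kind::Trunc32, 0, std::move(a), nullptr});
}

// Reduce the tree to one integer once the symbol's address has been chosen.
std::uint64_t evaluate(const ConstNode &n, std::uint64_t address) {
    switch (n.kind) {
    case Kind::SymbolAddress: return address;
    case Kind::Integer:       return n.literal;
    case Kind::Add:           return evaluate(*n.lhs, address) + evaluate(*n.rhs, address);
    case Kind::Trunc32:       return evaluate(*n.lhs, address) & 0xFFFFFFFFu;
    }
    return 0;
}

// Models: trunc (i64 add (i64 ptrtoint (@a), i64 5) to i32)
NodePtr build_example() { return trunc32(add(symbol_address(), integer(5))); }
```

The tree returned by `build_example()` has no numeric value by itself. Calling `evaluate(*build_example(), addr)` with a concrete `addr` plays the role of the later stage that knows where the global was placed.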
> On Mar 9, 2017, at 5:28 AM, Zeson Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi, All.
>
> Does anybody know about ConstantExpr in llvm? What's it?

The short version is: Some values can stand on their own in LLVM, independently of a basic block or a function. These include things like numbers, addresses of global variables or functions, etc. You can even do computations on them, giving you a ConstantExpr. Take for example:

    int array[10];
    int *x = array + 5;

giving this LLVM IR:

    @array = common global [10 x i32] zeroinitializer, align 16
    @x = global i32* bitcast (i8* getelementptr (i8, i8* bitcast ([10 x i32]* @array to i8*), i64 20) to i32*), align 8

and this assembly:

        .globl  _x                      ## @x
        .p2align        3
    _x:
        .quad   _array+20

They will be lowered at various places (some in the backend, some by the linker, some by the dynamic loader) but will be a constant value when the program is loaded.

- Matthias
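The `_array+20` in the assembly above is the symbol plus 5 elements of 4 bytes each. A minimal C++ check of that arithmetic follows; the names `g_array`, `g_x`, and `byte_offset_of_x` are hypothetical stand-ins for the globals in the example, renamed only to avoid clashing with `std::array`.

```cpp
#include <cstddef>

// Same shape as the example: a global array and a global pointer whose
// initializer is the array's address plus a constant element offset.
int g_array[10];
int *g_x = g_array + 5;

// Byte distance from the start of the array to the pointer's target.
std::ptrdiff_t byte_offset_of_x() {
    return reinterpret_cast<char *>(g_x) -
           reinterpret_cast<char *>(g_array);
}
```

`byte_offset_of_x()` equals `5 * sizeof(int)`, which is 20 on targets where `int` is 4 bytes. The pointer already holds its final value when the program starts, without any code in a function having computed it.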
Yes, but the example you give is a global variable. What about a local value that is also folded into a constant expression, as below?

    @a = global i32 0, align 4

    define signext i32 @main() {
      ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
    }

I found that the computation still exists in the generated assembly code. In theory, could a constant replace the result of the trunc operation? Would the assembler (or linker?) need to compute the result from the address of `a`, for example with something like `.quad trunc_macro(a+5)`, producing a temporary symbol value that is then moved into the eax register to be returned?

I am wondering what the difference is between a constant expression and separate instructions. For example:

    define signext i32 @main() {
      %1 = ptrtoint i32* @a to i64
      %2 = add i64 %1, 5
      %3 = trunc i64 %2 to i32
      ret i32 %3
    }

The code above is only folded in an optimized mode such as -O1 or -O2:

    define signext i32 @main() #0 {
      ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
    }

--
Zeson
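The two IR forms in the question describe the same computation; what differs is where the operations live (instructions inside a basic block versus a single constant operand). A rough C++ analogy, with hypothetical names `g_a`, `as_separate_steps`, and `as_single_expression`, is sketched below. It makes no claim about what machine code a compiler emits for either function; it only shows that the step-by-step form and the nested form yield the same value.

```cpp
#include <cstdint>

int g_a;  // stand-in for the global @a

// Mirrors the three separate instructions: ptrtoint, add, trunc.
std::int32_t as_separate_steps() {
    std::uint64_t p = reinterpret_cast<std::uintptr_t>(&g_a);  // ptrtoint
    std::uint64_t sum = p + 5;                                   // add
    return static_cast<std::int32_t>(
        static_cast<std::uint32_t>(sum));                        // trunc
}

// Mirrors the folded form: one nested expression, no named temporaries.
std::int32_t as_single_expression() {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(
        reinterpret_cast<std::uintptr_t>(&g_a) + 5));
}
```

Both functions return the same value in any given run, so the choice between the forms is about representation, not about the result.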