Hi, All.
Does anybody know about ConstantExpr in LLVM? What is it?
Since it always appears after LLVM optimization, such as at the -O2 level,
what is codegen supposed to do with it? I am wondering whether it represents
a constant value that can be determined or computed at compile time
(actually at link time) to improve performance, although we do not know the
actual constant value until the object file is linked.
Here is my example, but there is still code that computes the value at
run time.
cat a.C
int n=5;
int main(){
long a = (long)&n+7;
int b = a;
return b;
}
clang++ a.C -c -O2 -emit-llvm -S;cat a.ll
; ModuleID = 'a.C'
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.12.0"
@n = global i32 5, align 4
; Function Attrs: norecurse nounwind readnone ssp uwtable
define i32 @main() #0 {
ret i32 trunc (i64 add (i64 ptrtoint (i32* @n to i64), i64 7) to i32)
}
clang++ a.C -c -O2;objdump -d a.O
a.O: file format Mach-O 64-bit x86-64
Disassembly of section __TEXT,__text:
_main:
0: 55 pushq %rbp
1: 48 89 e5 movq %rsp, %rbp
4: 48 8d 05 00 00 00 00 leaq (%rip), %rax
b: 83 c0 07 *addl $7, %eax*
e: 5d popq %rbp
f: c3 retq
I am confused about what its function in LLVM is.
Thanks.
---------
Zeson
On 3/9/17 8:28 AM, Zeson Wu via llvm-dev wrote:
> Hi, All.
>
> Does anybody know about ConstantExpr in llvm? What's it?
> Since it always appears after llvm optimization such as -O2 level,
> what is it supposed to be to codegen? I am wondering it represents
> constant value which can be determined or computed at
> compile-time(actually is link-time) to improve performance. Although
> we do not know the actual constant value until the object file is linked.

You've pretty much got it. A Constant Expression (ConstantExpr) is
simply a constant value. Since some constant values depend upon
architecture-dependent features (e.g., structure layout, pointer size,
etc.), LLVM provides the ConstantExpr to represent them in a (more or
less) architecture-independent way.

For example, a GEP with constant indices on an internal global variable
will always compute the same value; it is a constant. However, we use a
GEP ConstantExpr to represent it; the backend code generator converts it
to the appropriate numerical constant when generating native code.

For more information on the ConstantExpr, please see the LLVM Language
Reference Manual (http://llvm.org/docs/LangRef.html#constant-expressions).

Regards,

John Criswell

--
John Criswell
Assistant Professor
Department of Computer Science, University of Rochester
http://www.cs.rochester.edu/u/criswell
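To make the GEP case concrete, here is a minimal C++ sketch. The names
Pair, g, second_ptr and field_offset_bytes are made up for illustration
and do not come from the thread. The initializer of second_ptr is an
address constant: the address of g plus a field offset that depends on
the target's structure layout. A compiler such as clang would typically
express such an initializer as a constant expression over the global
rather than as a number, though the exact IR depends on the compiler
version.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example type; the offset of `second` depends on the
// target's layout rules, which is why it is kept symbolic in the IR.
struct Pair {
    int first;
    int second;
};

Pair g = {1, 2};

// Address constant: "address of g" plus a fixed field offset.
// No code has to run at startup to compute it.
int *second_ptr = &g.second;

// Byte distance from the start of g to the field second_ptr points at.
std::ptrdiff_t field_offset_bytes() {
    return reinterpret_cast<char *>(second_ptr) - reinterpret_cast<char *>(&g);
}
```

Calling field_offset_bytes() should give the same number as
offsetof(Pair, second) on whatever target the code is built for.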
Still thanks John.
Another example:
int a;
int main(){
return 5+(long)(&a);
}
At -O0, the IR looks like this:
@a = global i32 0, align 4
; Function Attrs: noinline norecurse nounwind
define signext i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1, align 4
ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
}
At -O2, the IR is optimized as below:
@a = global i32 0, align 4
; Function Attrs: norecurse nounwind readnone
define signext i32 @main() local_unnamed_addr #0 {
ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
}
What I mean is: what is the advantage of this ConstantExpr pattern, given
that it is introduced at -O2?
How does the backend handle this pattern (which was a bitcast operator in
the case from my previous email)?
Thanks.
2017-03-11 0:32 GMT+08:00 John Criswell <jtcriswel at gmail.com>:
> On 3/9/17 11:28 PM, Zeson Wu wrote:
>> OK. Could you give a specific example to illustrate what you are
>> talking about? Some comparison would be even better.
>>
>> Here is an example of a BitCast ConstantExpr.
>>
>> target datalayout = "e-m:e-i64:64-n32:64"
>> target triple = "powerpc64le-unknown-linux-gnu"
>>
>> ; Function Attrs: norecurse
>> define signext i32 @main() local_unnamed_addr #0 {
>> entry:
>> %call = tail call signext i32 @f([4 x i128] [i128 0, i128 undef,
>> i128 1334440659191554973911497130241, i128 undef], [1 x i128] [i128
>> bitcast (<16 x i8> <i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2,
>> i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2, i8 2> to i128)],
>> i32 signext 16)
>> ret i32 %call
>> }
>>
>> declare signext i32 @f([4 x i128], [1 x i128], i32 signext)
>> local_unnamed_addr #1
>>
>> I am wondering why the result of the bitcast is not computed directly
>> after -O2 optimization. How is the backend supposed to handle this
>> ConstantExpr? Is there any advantage to doing this?
>
> The bitcast exists because the LLVM IR has type information; when you
> bitcast a constant from one type to another, its value doesn't change,
> but its static type does. It's essentially a way of expressing the same
> constant but with a different static type.
>
> Also, please keep conversations on the list instead of emailing me
> directly. That way, others can benefit from the conversation and can
> chime in if needed.
>
> Regards,
>
> John Criswell
--
Zeson
On 3/17/17 6:32 AM, Zeson Wu wrote:
> I mean what's the advantage of that pattern with constantexpr since it
> is introduced in O2 mode?

I see the constant expression in both the -O0 and -O2 IR above; I don't
think the optimizations are adding it to the code.

The advantage of the constant expression in this case is that LLVM can
express the constant in an architecture-independent way. First, you
can't add a pointer to an integer, so a constant cast (ptrtoint in this
IR) is needed. Second, even if the cast wasn't necessary, the value of
"@a" hasn't been determined yet (as code generation hasn't occurred
yet), so there is a need to represent the constant symbolically.

This is why constant expressions exist: they allow constants to be
represented symbolically, and they allow constants to be represented in
a way that can type check.

> How does back end handle this pattern (which is a bitcast operator in
> my last case in email before)?

After picking the location of the global variable "a", the backend
should be able to simplify the constant expression into a single integer
value. Use clang -S to see the assembly code that LLVM generates; I
suspect you'll see the constant expression simplified to a single
constant value.

Regards,

John Criswell
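The idea of keeping a constant symbolic until the address of "a" is
known can be sketched with a toy model in C++. This is only an
illustration under stated assumptions: SymbolPlusOffset and
resolve_low32 are hypothetical names, not LLVM classes or APIs, and the
real lowering is split between the backend, the linker and the loader.
The model stores "symbol + addend" and folds it to a single 32-bit
integer once some address has been chosen for the symbol, mirroring the
trunc(add(ptrtoint(@a), 5)) expression from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model (not an LLVM type): a constant written as
// "address of symbol + addend" before the address is known.
struct SymbolPlusOffset {
    const char *symbol;   // name of the global, e.g. "a"
    std::int64_t addend;  // constant added to its address
};

// Once an address has been picked for the symbol, the expression folds
// to one integer; the cast keeps only the low 32 bits, like trunc to i32.
std::uint32_t resolve_low32(const SymbolPlusOffset &c,
                            std::uint64_t symbol_address) {
    std::uint64_t full = symbol_address + static_cast<std::uint64_t>(c.addend);
    return static_cast<std::uint32_t>(full);
}
```

For example, with the made-up address 0x1000 for "a" and an addend of
5, resolve_low32 yields 0x1005. The same symbolic constant gives a
different number for a different placement of "a", which is why the IR
cannot simply store a literal.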
> On Mar 9, 2017, at 5:28 AM, Zeson Wu via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi, All.
>
> Does anybody know about ConstantExpr in llvm? What's it?

The short version is: Some values can stand on their own in LLVM,
independently of a basic block or a function. These include things like
numbers, addresses of global variables or functions, etc. You can even
do computations on them, giving you a ConstantExpr. Take for example:

int array[10];
int *x = array + 5;

giving this LLVM IR:

@array = common global [10 x i32] zeroinitializer, align 16
@x = global i32* bitcast (i8* getelementptr (i8, i8* bitcast ([10 x i32]* @array to i8*), i64 20) to i32*), align 8

and this assembly:

        .globl  _x                      ## @x
        .p2align        3
_x:
        .quad   _array+20

They will be lowered at various places (some in the backend, some by the
linker, some by the dynamic loader) but will be a constant value when
the program is loaded.

- Matthias
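A small C++ sketch of the array example, written with array + 5 so that
the offset is five ints, which is 20 bytes on targets where int is 4
bytes (matching the _array+20 in the assembly). The helper
x_offset_bytes is a made-up name for illustration. The point is that x
is fully initialized before any code in the program runs, even though
its numeric value is only fixed once array has been placed in memory.

```cpp
#include <cassert>
#include <cstddef>

int array[10];

// Address constant: start of `array` plus five elements.
int *x = array + 5;

// Byte distance between x and the start of array.
std::ptrdiff_t x_offset_bytes() {
    return reinterpret_cast<char *>(x) - reinterpret_cast<char *>(array);
}
```

x compares equal to &array[5], and x_offset_bytes() returns
5 * sizeof(int).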
Yes, but the example you gave is a global variable. What about a value
local to a function that is also folded into a constant expression, as below?
@a = global i32 0, align 4
define signext i32 @main() {
ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
}
I found that the computation still exists in the assembly code. In theory,
could a constant replace the result of the trunc operation?
Would the assembler (or linker?) need to calculate the result from the
variable `a`, for example as `.quad trunc_macro(a+5)`, a temporary symbol
value that is then moved into the eax register to be returned?
I am wondering what the difference is between a constant expression and
separate instructions.
For example,
define signext i32 @main() {
%1 = ptrtoint i32* @a to i64
%2 = add i64 %1, 5
%3 = trunc i64 %2 to i32
ret i32 %3
}
The code above would only be folded in an optimized mode such as -O1 or -O2:
define signext i32 @main() #0 {
ret i32 trunc (i64 add (i64 ptrtoint (i32* @a to i64), i64 5) to i32)
}
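The two IR forms above can be mirrored at the source level. The sketch
below is an illustration only, with made-up function names stepwise and
folded: one function uses a named temporary per operation (like the
three separate instructions), the other writes the same computation as
one nested expression (like the ConstantExpr). Both should compute the
same value; whether a compiler keeps separate instructions or folds
them into a constant expression depends on the compiler and the
optimization level.

```cpp
#include <cassert>
#include <cstdint>

int a;

// One named temporary per operation, like the three separate instructions.
std::uint32_t stepwise() {
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(&a);  // ptrtoint
    std::uintptr_t sum = address + 5;                               // add
    return static_cast<std::uint32_t>(sum);                         // trunc
}

// The same computation written as a single nested expression.
std::uint32_t folded() {
    return static_cast<std::uint32_t>(reinterpret_cast<std::uintptr_t>(&a) + 5);
}
```

In both versions the result depends on where `a` ends up in memory, so
neither can be reduced to a literal number before that address is known.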
--
Zeson