thr3ads.net - llvm dev - [llvm-dev] Optimization generate super long function definition [Apr 2020]


Jason -Zhong Sheng- Hu via llvm-dev

2020-Apr-12 22:49 UTC

[llvm-dev] Optimization generate super long function definition

Hi all,


Sorry to have sent the same question around. I am quite desperate for a
solution to this problem, and I figured the mailing list is the best bet.


In my code, I generate the following function:

define i32 @gl.qi([500 x i32] %x, i32 %i) {
entry:
  %x. = alloca [500 x i32]
  %i. = alloca i32
  %0 = alloca [500 x i32]
  store [500 x i32] %x, [500 x i32]* %x.
  store i32 %i, i32* %i.
  %x.1 = load [500 x i32], [500 x i32]* %x.
  %i.2 = load i32, i32* %i.
  store [500 x i32] %x.1, [500 x i32]* %0
  %1 = icmp slt i32 %i.2, 500
  br i1 %1, label %in-bound, label %out-of-bound

out-of-bound:                                     ; preds = %entry
  call void @gen.panic(i8* getelementptr inbounds ([22 x i8], [22 x i8]* @pool.str.2, i32 0, i32 0))
  unreachable

in-bound:                                         ; preds = %entry
  %2 = getelementptr inbounds [500 x i32], [500 x i32]* %0, i32 0, i32 %i.2
  %idx = load i32, i32* %2
  ret i32 %idx
}


The high-level functionality is to use %i to index %x; if %i is out of
bounds, a panic function is called instead.


Consider the store line:

  store [500 x i32] %x, [500 x i32]* %x.


Once I pass this function to opt -O1 -S --verify --verify-each, it generates
code like this:

define i32 @gl.qi([500 x i32] %x, i32 %i) local_unnamed_addr {
entry:
  %0 = alloca [500 x i32], align 4
  %x.fca.0.extract = extractvalue [500 x i32] %x, 0
  %x.fca.1.extract = extractvalue [500 x i32] %x, 1
  %x.fca.2.extract = extractvalue [500 x i32] %x, 2
  %x.fca.3.extract = extractvalue [500 x i32] %x, 3
  %x.fca.4.extract = extractvalue [500 x i32] %x, 4
  %x.fca.5.extract = extractvalue [500 x i32] %x, 5


and so on for all 500 elements. When I raise the array size to 50000, it won’t stop.


This is puzzling. I am not sure why a store instruction must be expanded into a
sequence of extractvalues followed by stores. Is there a way to turn off this
particular optimization without turning off optimization as a whole?

Or am I going about this simple task the wrong way?


Thanks,
Jason Hu
https://hustmphrrr.github.io/

Neil Henning via llvm-dev

2020-Apr-13 13:10 UTC


Hey Jason Hu,

So I think this is the SROA pass breaking up the load/store of the
aggregate (the [500 x i32] array) into individual load/stores so that it
can see if any are unused or can have their stores forwarded. This is a bit
of LLVM that I personally find pretty dumb - but if you look at how Clang
generates code for your above pattern it'll output memcpy intrinsics
instead of doing load/stores, which gets around this issue.

Note: the compiler hasn't hung; it's just spinning, comparing a few
thousand instructions generated by SROA against each other. You might
need to wait until the end of the universe, but it should complete
eventually ;)

Cheers,
-Neil.
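
A minimal sketch of the memcpy-based shape described above. It is not taken
from the thread: it assumes the array is passed by pointer rather than by
value, uses the typed-pointer IR syntax of the LLVM releases current at the
time (newer releases spell pointers as plain ptr), and declares @gen.panic and
@pool.str.2 as external placeholders. The names %copy, %dst, %src, %ok and %p
are invented for illustration.

```llvm
; Assumption: these two symbols are defined elsewhere in the module.
@pool.str.2 = external constant [22 x i8]
declare void @gen.panic(i8*)
declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

; Assumption: the caller passes a pointer to the array, not the array value.
define i32 @gl.qi([500 x i32]* %x, i32 %i) {
entry:
  ; Local copy of the array, filled by one memcpy instead of an
  ; aggregate load/store pair.
  %copy = alloca [500 x i32], align 4
  %dst = bitcast [500 x i32]* %copy to i8*
  %src = bitcast [500 x i32]* %x to i8*
  ; 500 elements * 4 bytes = 2000 bytes
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src, i64 2000, i1 false)
  ; Same bound check as in the original function.
  %ok = icmp slt i32 %i, 500
  br i1 %ok, label %in-bound, label %out-of-bound

out-of-bound:
  call void @gen.panic(i8* getelementptr inbounds ([22 x i8], [22 x i8]* @pool.str.2, i32 0, i32 0))
  unreachable

in-bound:
  %p = getelementptr inbounds [500 x i32], [500 x i32]* %copy, i32 0, i32 %i
  %idx = load i32, i32* %p
  ret i32 %idx
}
```

Because no [500 x i32] value is ever loaded or stored as a whole, there is no
first-class aggregate for SROA to split, so the per-element extractvalue
sequence should not appear.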


-- 
Neil Henning
Senior Software Engineer Compiler
unity.com

Jason -Zhong Sheng- Hu via llvm-dev

2020-Apr-13 22:43 UTC


Hey Neil,

Thank you for the reply and the information.

I think it is dumb... I have changed my implementation to pass a byval array
pointer instead. In this case, the optimization doesn't trigger.

I think the expansion in this optimization is quite useless, because a byval
array pointer achieves the same thing (I suppose) while the generated code
stays small. The expansion just exposes unnecessarily complex execution detail.

Thanks,
Jason Hu
https://hustmphrrr.github.io/
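
The changed implementation is not shown in the thread. The following is only a
sketch of what a byval array pointer version might look like, again in the
typed-pointer syntax of that era and with @gen.panic and @pool.str.2 declared
as external placeholders. The byval([500 x i32]) spelling, the @caller
function, and the names %ok, %p and %r are assumptions made for illustration.

```llvm
; Assumption: these two symbols are defined elsewhere in the module.
@pool.str.2 = external constant [22 x i8]
declare void @gen.panic(i8*)

; byval gives the callee its own copy of the pointee, so the function
; needs no local alloca and no aggregate load/store.
define i32 @gl.qi([500 x i32]* byval([500 x i32]) align 4 %x, i32 %i) {
entry:
  %ok = icmp slt i32 %i, 500
  br i1 %ok, label %in-bound, label %out-of-bound

out-of-bound:
  call void @gen.panic(i8* getelementptr inbounds ([22 x i8], [22 x i8]* @pool.str.2, i32 0, i32 0))
  unreachable

in-bound:
  %p = getelementptr inbounds [500 x i32], [500 x i32]* %x, i32 0, i32 %i
  %idx = load i32, i32* %p
  ret i32 %idx
}

; Hypothetical caller: the byval attribute is repeated at the call site.
define i32 @caller([500 x i32]* %arr) {
entry:
  %r = call i32 @gl.qi([500 x i32]* byval([500 x i32]) align 4 %arr, i32 7)
  ret i32 %r
}
```

The copy implied by byval is materialized when the call is lowered, so the IR
itself never holds the array as a single SSA value.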
