thr3ads.net - llvm dev - [llvm-dev] LSR [Apr 2017]


Jonas Paulsson via llvm-dev

2017-Apr-10 13:47 UTC

[llvm-dev] LSR

Hi,

I find that LSR is not helping enough on avoiding unfoldable offsets for 
SystemZ. When the loop has three stores with unfoldable offsets, LSR 
rewrites the IV in a good way. However, if adding another store with a 
foldable offset that fits already, LSR fails to rewrite the three stores.

And if I happen to add a too big *positive* offset (the first three were 
negative) instead of a foldable one, only the positive gets transformed.

* LSR is not rewriting the IV to have three foldable offsets rather than 
one.

* It would actually be preferred in this case to use a second address 
register for the offset that is too far away from the others.
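
To make the two points above concrete, here is a minimal sketch of how moving the IV base changes which offsets fold. It is not LSR's algorithm. It assumes an unsigned 12-bit displacement field (0 to 4095), and the offsets and function names are hypothetical, not taken from the bug report.

```cpp
#include <cstdint>
#include <vector>

// Assumption: the addressing mode folds an unsigned 12-bit displacement.
constexpr int64_t kMaxDisp12 = 4095;

// True if the offset can be encoded directly in the memory instruction.
bool foldsInDisp12(int64_t Offset) {
  return Offset >= 0 && Offset <= kMaxDisp12;
}

// Number of accesses whose offset folds once the IV base is moved by Shift
// (every access then uses Offset - Shift as its displacement).
int countFoldable(const std::vector<int64_t> &Offsets, int64_t Shift) {
  int Count = 0;
  for (int64_t Offset : Offsets)
    if (foldsInDisp12(Offset - Shift))
      ++Count;
  return Count;
}

// Try each offset as the new base and return the shift that folds the most
// accesses. A shift of 0 (no rewrite) is the starting point; ties keep the
// earliest candidate.
int64_t bestShift(const std::vector<int64_t> &Offsets) {
  int64_t Best = 0;
  int BestCount = countFoldable(Offsets, 0);
  for (int64_t Candidate : Offsets) {
    int Count = countFoldable(Offsets, Candidate);
    if (Count > BestCount) {
      BestCount = Count;
      Best = Candidate;
    }
  }
  return Best;
}
```

With the hypothetical offsets {-48, -32, -16, 8192}, no access folds without a rewrite, a shift of 8192 folds only the positive one, and a shift of -48 folds the three nearby stores, leaving the distant access as the candidate for a second address register.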

Has anyone any idea on how to best handle this? Can LSR "split" an IV to
use an extra register? Or would this need to be done in a target
specific pass?

For a reduced test case for this problem, see 
https://bugs.llvm.org//show_bug.cgi?id=32548.

Thanks,

Jonas

Hal Finkel via llvm-dev

2017-Apr-11 02:53 UTC


[llvm-dev] LSR

On 04/10/2017 08:47 AM, Jonas Paulsson via llvm-dev wrote:
> Hi,
>
> I find that LSR is not helping enough on avoiding unfoldable offsets 
> for SystemZ. When the loop has three stores with unfoldable offsets, 
> LSR rewrites the IV in a good way. However, if adding another store 
> with a foldable offset that fits already, LSR fails to rewrite the 
> three stores.
>
> And if I happen to add a too big *positive* offset (the first three 
> were negative) instead of a foldable one, only the positive gets 
> transformed.
>
> * LSR is not rewriting the IV to have three foldable offsets rather 
> than one.
>
> * It would actually be preferred in this case to use a second address 
> register for the offset that is too far away from the others.
>
> Has anyone any idea on how to best handle this? Can LSR "split" an IV
> to use an extra register? Or would this need to be done in a target
> specific pass?
When you say "an extra address register" would this imply LSR adding an
additional PHI?

  -Hal
>
> For a reduced test case for this problem, see 
> https://bugs.llvm.org//show_bug.cgi?id=32548.
>
> Thanks,
>
> Jonas
>
-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory

Jonas Paulsson via llvm-dev

2017-Apr-11 13:52 UTC


[llvm-dev] LSR

>> Has anyone any idea on how to best handle this? Can LSR "split" an IV
>> to use an extra register? Or would this need to be done in a target
>> specific pass?
>
> When you say "an extra address register" would this imply LSR adding
> an additional PHI?
>
>  -Hal
>

Yes, that would have worked well, at least in this type of loop. Can LSR
do this?

I experimented with adding a check for a 12-bit offset distance in
isProfitableIncrement() (checking against all members of the chain),
which resulted in LSR producing several chains instead of just one.
The chains that formed now had immediate offsets that were close to
each other, so they should result in addresses with 12-bit offsets.
But, to my disappointment, LSR did not handle these different chains
by generating new PHI nodes for each of them (or by skipping those
that ended up with just one store in the chain); instead it still
output the stores in the same way as before.
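
The grouping criterion described above can be modelled in isolation. The following is a rough sketch under stated assumptions, not the actual change to isProfitableIncrement(): it takes plain integer offsets instead of LSR's internal chain data, assumes a span of 4095 for a 12-bit displacement, and uses hypothetical names throughout.

```cpp
#include <cstdint>
#include <vector>

// Assumption: members of one chain must lie within a 12-bit span.
constexpr int64_t kDisp12Span = 4095;

using Chain = std::vector<int64_t>;

// True if Offset is within the 12-bit distance of every chain member.
bool fitsChain(const Chain &C, int64_t Offset) {
  for (int64_t Member : C) {
    int64_t Dist = Offset > Member ? Offset - Member : Member - Offset;
    if (Dist > kDisp12Span)
      return false;
  }
  return true;
}

// Place each offset into the first chain it fits, or start a new chain.
std::vector<Chain> splitIntoChains(const std::vector<int64_t> &Offsets) {
  std::vector<Chain> Chains;
  for (int64_t Offset : Offsets) {
    bool Placed = false;
    for (Chain &C : Chains) {
      if (fitsChain(C, Offset)) {
        C.push_back(Offset);
        Placed = true;
        break;
      }
    }
    if (!Placed)
      Chains.push_back(Chain{Offset});
  }
  return Chains;
}
```

For the hypothetical offsets {-48, -32, -16, 8192} this yields two chains, one with the three nearby stores and one holding only the distant access. The second is the single-member case mentioned above, which would either need its own PHI or be skipped.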

I also see that LSR is thinking in terms of increments between the
memory accesses. In the loop I am working with, it's disappointing to
see that before each memory access the base address is loaded into a
register, then the offset is added, and then the access is made, which
is 3 instructions. Per LSR's intentions, it should have been just an
add/sub after the previous access and before the next memory access.
I wonder where this is supposed to be handled: in some sort of target
pre-isel pass that chains the GEPs? Or is this just folded more often
on other targets?
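
As an illustration of the "increments between accesses" view, the sketch below computes the add/sub amount that would be applied to a single running address register before each access, instead of reloading the base and adding the full offset each time. It is only a model with hypothetical names and offsets; it says nothing about where in the pipeline this should happen.

```cpp
#include <cstdint>
#include <vector>

// For accesses at Base + Offsets[i], visited in order, return the delta to
// add to a running address register before each access. The register is
// assumed to start out holding Base (offset 0).
std::vector<int64_t> chainIncrements(const std::vector<int64_t> &Offsets) {
  std::vector<int64_t> Deltas;
  int64_t Prev = 0;
  for (int64_t Offset : Offsets) {
    Deltas.push_back(Offset - Prev);
    Prev = Offset;
  }
  return Deltas;
}
```

For hypothetical offsets {8192, 12288, 16384} the deltas are {8192, 4096, 4096}: one add before each access, rather than a base reload plus an add.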

/Jonas
