thr3ads.net - llvm dev - [llvm-dev] InstCombine question on combineLoadToOperationType [Nov 2016]

Pete Couperus via llvm-dev

2016-Nov-16 00:22 UTC

[llvm-dev] InstCombine question on combineLoadToOperationType

Hello,

Context: We have a backend where v32i1 is a Legal type, but the storage for
v32i1 is not 32-bits/uses a different instruction sequence.
We ran into an issue because combineLoadToOperationType changed v32i1 loads into
i32 loads, so a sequence like:
define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %a = load <32 x i1>, <32 x i1>* %A
  store <32 x i1> %a, <32 x i1>* %B
  ret void
}

Is transformed to:
define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
  %1 = bitcast <32 x i1>* %A to i32*
  %a1 = load i32, i32* %1, align 4
  %2 = bitcast <32 x i1>* %B to i32*
  store i32 %a1, i32* %2, align 4
  ret void
}

This looks to be intentional.
Is there a way to specify in the data-layout that v32i1 storage is not 32-bits?
Absent that, is there any other reliable way to retain the original vector
loads/store without just disabling this part of InstCombine?
Or is it the backend's responsibility to try and work with this?
Thanks!

Pete

Friedman, Eli via llvm-dev

2016-Nov-16 19:23 UTC

On 11/15/2016 4:22 PM, Pete Couperus via llvm-dev wrote:
> Hello,
>
> Context: We have a backend where v32i1 is a Legal type, but the 
> storage for v32i1 is not 32-bits/uses a different instruction sequence.
>
> We ran into an issue because combineLoadToOperationType changed v32i1 
> loads into i32 loads, so a sequence like:
>
> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>   %a = load <32 x i1>, <32 x i1>* %A
>   store <32 x i1> %a, <32 x i1>* %B
>   ret void
> }
>
> Is transformed to:
>
> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>   %1 = bitcast <32 x i1>* %A to i32*
>   %a1 = load i32, i32* %1, align 4
>   %2 = bitcast <32 x i1>* %B to i32*
>   store i32 %a1, i32* %2, align 4
>   ret void
> }
>
> This looks to be intentional.
>
> Is there a way to specify in the data-layout that v32i1 storage is not 
> 32-bits?
>
No, not at the moment.  You could propose something, but you'd probably
have a hard time convincing anyone it's necessary; nobody has cared
about this for a very long time.

> Absent that, is there any other reliable way to retain the original
> vector loads/store without just disabling this part of InstCombine?
>
No, and you'll run into other problems (e.g. alias analysis) if the data
layout lies about the size of a load or store.

> Or is it the backend’s responsibility to try and work with this?
>
Where are these loads coming from?  x86 without AVX512 doesn't have any
convenient way to generate code for a <32 x i1> store, but it doesn't
matter because frontends don't generate <N x i1> loads and stores.

If you have a frontend which is generating loads and stores like this, 
you could probably change it to use some other sequence (like a 
platform-specific intrinsic, or some sequence involving sext/trunc).

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
Foundation Collaborative Project
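
As a rough illustration of the intrinsic route mentioned above, the
load/store pair could be hidden behind target-specific calls. This is only a
sketch: the names @mytarget.mask.load and @mytarget.mask.store are
hypothetical stand-ins (they are not real LLVM intrinsics), and the i8*
operand type is an assumption made for the example.

```llvm
; Sketch only: @mytarget.mask.load / @mytarget.mask.store are hypothetical
; stand-ins for target-specific intrinsics; they are not real LLVM intrinsics.
declare <32 x i1> @mytarget.mask.load(i8*)
declare void @mytarget.mask.store(<32 x i1>, i8*)

define void @bits(i8* %A, i8* %B) {
  ; There is no plain load/store of <32 x i1> here, so there is nothing
  ; for InstCombine to rewrite into an i32 load/store pair.
  %a = call <32 x i1> @mytarget.mask.load(i8* %A)
  call void @mytarget.mask.store(<32 x i1> %a, i8* %B)
  ret void
}
```

The trade-off is that generic passes then know little about what memory
these calls touch unless the declarations carry suitable attributes.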

Pete Couperus via llvm-dev

2016-Nov-17 16:28 UTC

> On 11/15/2016 4:22 PM, Pete Couperus via llvm-dev wrote:
>> Hello,
>>
>> Context: We have a backend where v32i1 is a Legal type, but the storage for
>> v32i1 is not 32-bits/uses a different instruction sequence.
>> We ran into an issue because combineLoadToOperationType changed v32i1 loads
>> into i32 loads, so a sequence like:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %a = load <32 x i1>, <32 x i1>* %A
>>   store <32 x i1> %a, <32 x i1>* %B
>>   ret void
>> }
>>
>> Is transformed to:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %1 = bitcast <32 x i1>* %A to i32*
>>   %a1 = load i32, i32* %1, align 4
>>   %2 = bitcast <32 x i1>* %B to i32*
>>   store i32 %a1, i32* %2, align 4
>>   ret void
>> }
>>
>> This looks to be intentional.
>> Is there a way to specify in the data-layout that v32i1 storage is not
>> 32-bits?
>
> No, not at the moment.  You could propose something, but you'd probably have
> a hard time convincing anyone it's necessary; nobody has cared about this
> for a very long time.
>
>> Absent that, is there any other reliable way to retain the original vector
>> loads/store without just disabling this part of InstCombine?
>
> No, and you'll run into other problems (e.g. alias analysis) if the data
> layout lies about the size of a load or store.
>
>> Or is it the backend’s responsibility to try and work with this?
>
> Where are these loads coming from?  x86 without AVX512 doesn't have any
> convenient way to generate code for a <32 x i1> store, but it doesn't
> matter because frontends don't generate <N x i1> loads and stores.
>
> If you have a frontend which is generating loads and stores like this, you
> could probably change it to use some other sequence (like a platform-specific
> intrinsic, or some sequence involving sext/trunc).



We do have a frontend that can generate <32 x i1> loads/stores, though it
is rare that these are inst-combined to i32 loads/stores like here (these were
only illustrative examples).
I’m trying to decide what the best way to remedy this is, and this info and
suggestions help.
Thanks!

Pete

Mehdi Amini via llvm-dev

2016-Nov-17 22:10 UTC

> On Nov 16, 2016, at 11:23 AM, Friedman, Eli via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On 11/15/2016 4:22 PM, Pete Couperus via llvm-dev wrote:
>> Hello,
>>
>> Context: We have a backend where v32i1 is a Legal type, but the storage
>> for v32i1 is not 32-bits/uses a different instruction sequence.
>> We ran into an issue because combineLoadToOperationType changed v32i1
>> loads into i32 loads, so a sequence like:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %a = load <32 x i1>, <32 x i1>* %A
>>   store <32 x i1> %a, <32 x i1>* %B
>>   ret void
>> }
>>
>> Is transformed to:
>> define void @bits(<32 x i1>* %A, <32 x i1>* %B) {
>>   %1 = bitcast <32 x i1>* %A to i32*
>>   %a1 = load i32, i32* %1, align 4
>>   %2 = bitcast <32 x i1>* %B to i32*
>>   store i32 %a1, i32* %2, align 4
>>   ret void
>> }
>>
>> This looks to be intentional.
>> Is there a way to specify in the data-layout that v32i1 storage is not
>> 32-bits?
>
> No, not at the moment.  You could propose something, but you'd probably
> have a hard time convincing anyone it's necessary; nobody has cared about
> this for a very long time.
>
>> Absent that, is there any other reliable way to retain the original
>> vector loads/store without just disabling this part of InstCombine?
>
> No, and you'll run into other problems (e.g. alias analysis) if the
> data layout lies about the size of a load or store.
>
>> Or is it the backend’s responsibility to try and work with this?
>
> Where are these loads coming from?  x86 without AVX512 doesn't have any
> convenient way to generate code for a <32 x i1> store, but it doesn't
> matter because frontends don't generate <N x i1> loads and stores.
>
> If you have a frontend which is generating loads and stores like this, you
> could probably change it to use some other sequence (like a platform-specific
> intrinsic, or some sequence involving sext/trunc).
>
Why not just generate the code with the proper storage? If <32 x i1> values
are used where the storage is <32 x i8> (for example), it seems like a bad
idea to lie to the IR and hide it with a platform-specific intrinsic, right?
I fear this would cause other problems down the line in the optimizer.

--
Mehdi
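
A minimal sketch of what emitting "the proper storage" might look like,
reusing the @bits example from the start of the thread. It assumes, purely
for illustration, that the in-memory form is one byte per lane (<32 x i8>,
the example layout named above); the real layout of the backend in question
is not stated in this thread.

```llvm
; Sketch only: assumes one byte of storage per mask lane (<32 x i8>).
define void @bits(<32 x i8>* %A, <32 x i8>* %B) {
  ; The load has the true in-memory width, so the data layout and alias
  ; analysis see the real access size.
  %bytes = load <32 x i8>, <32 x i8>* %A
  ; Narrow to the mask type for computation (keeps the low bit of each lane).
  %mask = trunc <32 x i8> %bytes to <32 x i1>
  ; ... operations on %mask would go here ...
  ; Widen before storing: sext gives 0 or -1 per lane, zext would give 0 or 1.
  %wide = sext <32 x i1> %mask to <32 x i8>
  store <32 x i8> %wide, <32 x i8>* %B
  ret void
}
```

Whether sext or zext is the right widening depends on how the target
represents a set lane in memory, which is another assumption here.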

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20161117/9eef4de4/attachment.html>
