thr3ads.net - llvm dev - [llvm-dev] New intrinsic property IntrOnlyWrite [Mar 2016]

If this information is useful, please help other people find it:
Share via:

Nicolai Hähnle via llvm-dev

2016-Mar-19 03:16 UTC

[llvm-dev] New intrinsic property IntrOnlyWrite

Hi,

I'd like to draw your attention to http://reviews.llvm.org/D18291, in 
which I propose a new intrinsic property for intrinsics that are lowered 
to instructions that mayStore, but are neither mayLoad nor hasSideEffects.

This is relevant for AMDGPU, where we have store instructions that don't 
operate on pointers. The codegen backend understands these perfectly 
well as stores, and so we can enable better scheduling decisions than if 
we forced these instruction to hasSideEffects.

In a perfect world, we'd be able to model the behavior of these load and 
store intrinsics via ReadWriteArgMem, but that would require massive 
changes in how LLVM thinks about memory locations and how to describe them.

This comparatively minor addition allows us to move forward with decent 
scheduling in codegen for the time being.

Cheers,
Nicolai

Philip Reames via llvm-dev

2016-Mar-19 19:47 UTC

head link

[llvm-dev] New intrinsic property IntrOnlyWrite

I'm generally in support of this change.  I haven't looked at the patch 
yet, but the direction seems worthwhile.

Note that we already have a writeonly predicate in a few places in the 
code (BasicAA being one).  If we do introduce the new intrinsic 
property, we should refactor all of these places to use the new 
mechanism.  This could be separate changes, but should be considered 
required as part of adding the new property.

Another way you might consider slicing the problem is to introduce a 
WriteOnlyArg property.  When combined with ArgMemOnly, this would more 
precisely model the pseudo store (right?) and is a better fit for the 
memset/memcpy use case already handled in BasicAA.

Philip

p.s. Inline comments below specific to your use case.

On 03/18/2016 08:16 PM, Nicolai Hähnle via llvm-dev
wrote:> Hi,
>
> I'd like to draw your attention to http://reviews.llvm.org/D18291, in 
> which I propose a new intrinsic property for intrinsics that are 
> lowered to instructions that mayStore, but are neither mayLoad nor 
> hasSideEffects.
>
> This is relevant for AMDGPU, where we have store instructions that 
> don't operate on pointers. The codegen backend understands these 
> perfectly well as stores, and so we can enable better scheduling 
> decisions than if we forced these instruction to hasSideEffects.
Can you give a bit more detail here?  Example, etc..?>
> In a perfect world, we'd be able to model the behavior of these load 
> and store intrinsics via ReadWriteArgMem, but that would require 
> massive changes in how LLVM thinks about memory locations and how to 
> describe them.This comments makes me think you might have a much deeper problem. Let's 
see an actual example first though; I might just be misreading your
intent.>
> This comparatively minor addition allows us to move forward with 
> decent scheduling in codegen for the time being.
>
> Cheers,
> Nicolai
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Mehdi Amini via llvm-dev

2016-Mar-19 21:25 UTC

head link

[llvm-dev] New intrinsic property IntrOnlyWrite

Hi,

Can you elaborate what is the impact at the IR level?
If the point is just about how you lower for you target, why are you needing an
IR level attribute? You backend if free to specialize the lowering for any
intrinsic regardless the IR level attributes.

-- 
Mehdi

> On Mar 18, 2016, at 8:16 PM, Nicolai Hähnle via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> Hi,
> 
> I'd like to draw your attention to http://reviews.llvm.org/D18291, in
which I propose a new intrinsic property for intrinsics that are lowered to
instructions that mayStore, but are neither mayLoad nor hasSideEffects.
> 
> This is relevant for AMDGPU, where we have store instructions that
don't operate on pointers. The codegen backend understands these perfectly
well as stores, and so we can enable better scheduling decisions than if we
forced these instruction to hasSideEffects.
> 
> In a perfect world, we'd be able to model the behavior of these load
and store intrinsics via ReadWriteArgMem, but that would require massive changes
in how LLVM thinks about memory locations and how to describe them.
> 
> This comparatively minor addition allows us to move forward with decent
scheduling in codegen for the time being.
> 
> Cheers,
> Nicolai
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Nicolai Hähnle via llvm-dev

2016-Mar-21 15:54 UTC

head link

[llvm-dev] New intrinsic property IntrOnlyWrite

On 19.03.2016 14:47, Philip Reames wrote:> I'm generally in support of this change.  I haven't looked at the
patch
> yet, but the direction seems worthwhile.
>
> Note that we already have a writeonly predicate in a few places in the
> code (BasicAA being one).  If we do introduce the new intrinsic
> property, we should refactor all of these places to use the new
> mechanism.  This could be separate changes, but should be considered
> required as part of adding the new property.
That makes a lot of sense.
> Another way you might consider slicing the problem is to introduce a
> WriteOnlyArg property.  When combined with ArgMemOnly, this would more
> precisely model the pseudo store (right?) and is a better fit for the
> memset/memcpy use case already handled in BasicAA.
This is unsufficient for our use case. Let me try to explain with an 
example (also for your later comments). We have an intrinsic

declare void @llvm.amdgcn.buffer.store.format.f32(float, <4 x i32>, i32, 
i32, i1, i1) #1

that stores a 32-bit value in a memory location that is described by a 
"buffer descriptor" (the <4 x i32>), an i32 index and an i32
offset. You
can think of the buffer descriptor as a "heavy pointer" that contains
a
base address and can also contain additional information like a buffer 
size, a stride, and a data format. The given index is multiplied by the 
stride before being added to the base address; the offset is an 
additional byte offset. The resulting address is checked against the 
buffer size (and an out of bounds store is silently discarded, which 
fits the requirements of APIs like OpenGL). Furthermore, the data being 
stored may be subject to a format conversion (float to int, etc.). The 
additional i1 arguments enable provide some cache control.

If you have some advice on how to fit such a memory addressing model 
into the ArgMemOnly logic that would be appreciated, but from my reading 
of the code it appears that trying to make ArgMemOnly work with this 
would be a rather large project which we don't really have the resources 
to tackle right now.

Why am I pushing for IntrWriteOnly?

I suspect the merely IntrWriteOnly without a functional ArgMemOnly does 
not provide a lot of opportunity for optimization at the IR level.

However, the codegen backend does understand the resulting hardware 
instructions, which are marked mayLoad = 0, mayStore = 1, hasSideEffects 
= 0. Having mayLoad = 0 and hasSideEffects = 0 makes a difference for 
instruction scheduling.

If you try to map the intrinsic as-is, without IntrWriteOnly, onto such 
hardware instructions, TableGen will (correctly) complain about a 
mismatch of mayLoad and hasSideEffects.

So I'd like to do a minimal change that will make TableGen happy, but I 
do not want that change to be an ugly kludge.

I understand your concern about having this interact nicely with 
BasicAA. Perhaps both an IntrWriteOnlyMem and an IntrWriteOnlyArgMem 
should be added for orthogonality?

Cheers,
Nicolai
>
> Philip
>
> p.s. Inline comments below specific to your use case.
>
> On 03/18/2016 08:16 PM, Nicolai Hähnle via llvm-dev wrote:
>> Hi,
>>
>> I'd like to draw your attention to http://reviews.llvm.org/D18291,
in
>> which I propose a new intrinsic property for intrinsics that are
>> lowered to instructions that mayStore, but are neither mayLoad nor
>> hasSideEffects.
>>
>> This is relevant for AMDGPU, where we have store instructions that
>> don't operate on pointers. The codegen backend understands these
>> perfectly well as stores, and so we can enable better scheduling
>> decisions than if we forced these instruction to hasSideEffects.
> Can you give a bit more detail here?  Example, etc..?
>>
>> In a perfect world, we'd be able to model the behavior of these
load
>> and store intrinsics via ReadWriteArgMem, but that would require
>> massive changes in how LLVM thinks about memory locations and how to
>> describe them.
> This comments makes me think you might have a much deeper problem.
Let's
> see an actual example first though; I might just be misreading your intent.
>>
>> This comparatively minor addition allows us to move forward with
>> decent scheduling in codegen for the time being.
>>
>> Cheers,
>> Nicolai
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Nicolai Hähnle via llvm-dev

2016-Mar-21 15:58 UTC

head link

[llvm-dev] New intrinsic property IntrOnlyWrite

On 19.03.2016 16:25, Mehdi Amini wrote:> Hi,
>
> Can you elaborate what is the impact at the IR level?
> If the point is just about how you lower for you target, why are you
needing an IR level attribute? You backend if free to specialize the lowering
for any intrinsic regardless the IR level attributes.
As I explained in my reply to Philip, what I really need is a way to get 
TableGen to shut up about what it reasonably believes to be a mismatch 
between the properties of the intrinsic (which it conservatively 
believes to be mayLoad = 1, mayStore = 1, hasSideEffects = 1) and the 
hardware instruction (which is correctly mayLoad = 0, mayStore = 1, 
hasSideEffects = 0).

Indeed, write-only without an ArgMemOnly property may well be useless at 
the IR level. If you think that there is a better way to explain the 
situation to TableGen, please let me know.

Please see my mail to Philip for a more detailed explanation for why we 
are in this situation.

Thanks,
Nicolai

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Mar 2016 - New intrinsic property IntrOnlyWrite

[llvm-dev] New intrinsic property IntrOnlyWrite

[llvm-dev] New intrinsic property IntrOnlyWrite

[llvm-dev] New intrinsic property IntrOnlyWrite

[llvm-dev] New intrinsic property IntrOnlyWrite

[llvm-dev] New intrinsic property IntrOnlyWrite

Possibly Parallel Threads