thr3ads.net - llvm dev - [LLVMdev] Reducing Generic Address Space Usage [Mar 2014]

If this information is useful, please help other people find it:
Share via:

Jingyue Wu

2014-Mar-26 00:07 UTC

[LLVMdev] Reducing Generic Address Space Usage

On Tue, Mar 25, 2014 at 3:21 PM, Matt Arsenault
<Matthew.Arsenault at amd.com>wrote:
>  On 03/25/2014 02:31 PM, Jingyue Wu wrote:
>
>
> However, we have three concerns on this:
> a) I doubt this optimization is valid for all targets, because LLVM
> language reference (
> http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction) says
> addrspacecast "can be a no-op cast or a complex value modification,
> depending on the target and the address space pair."
>
> I think most of the simple cast optimizations would be acceptable. The
> addrspacecasted pointer still needs to point to the same memory location,
> so changing an access to use a different address space would be OK. I think
> canonicalizing accesses to use the original address space of a casted
> pointer when possible would make sense.
>
"the address space conversion is legal then both result and operand refer
to the same memory location". I don't quite understand this sentence.
Does
the same memory location mean the same numeric value?

>
>
>   b) NVPTX and R600 have different address numbering for the generic
> address space, which makes things more complicated.
> c) We don't have a good understanding of the R600 backend.
>
>
> R600 currently does not support the flat address space instructions
> intended to use for the generic address space. I posted a patch a while ago
> that half added it, which I can try to work on finishing if it would help.
>
> I also do not understand how NVPTX uses address spaces, particularly how
> it can use 0 as the the generic address space.
>
NVPTX backend generates ld.f32 for reading from the generic address space.
There's no special machine instruction to read/write from/to the generic
address space in R600?

>
>
>   2. How effective do we want this optimization to be?
>
>  In the short term, I want it to be able to eliminate unnecessary
> non-generic-to-generic addrspacecasts the front-end generates for the NVPTX
> target. For example,
>
>  %p1 = addrspace i32 addrspace(3)* %p0 to i32*
> %v = load i32* %p1
>
>  =>
>
>  %v = load i32 addrspace(3)* %p0
>
>  We want similar optimization for store+addrspacecast and
> gep+addrspacecast as well.
>
>  In a long term, we could for sure improve this optimization to handle
> more instructions and more patterns.
>
>   I believe most of the cast simplifications that apply to bitcasts of
> pointers also apply to addrspacecast. I have some patches waiting that
> extend some of the more basic ones to understand addrspacecast (e.g.
>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html),
> plus a few more that I haven't posted yet. Mostly they are little cast
> simplifications like your example in instcombine, but also SROA to
> eliminate allocas that are addrspacecasted.
>
We also think InstCombine is a good place to put this optimization, if we
decide to go with target-independent. Looking forward to your patches!

>
>
> -Matt
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140325/7214198d/attachment.html>

Matt Arsenault

2014-Mar-26 00:32 UTC

head link

[LLVMdev] Reducing Generic Address Space Usage

On 03/25/2014 05:07 PM, Jingyue Wu wrote:>
>
>
> On Tue, Mar 25, 2014 at 3:21 PM, Matt Arsenault 
> <Matthew.Arsenault at amd.com <mailto:Matthew.Arsenault at
amd.com>> wrote:
>
>     On 03/25/2014 02:31 PM, Jingyue Wu wrote:
>>
>>     However, we have three concerns on this:
>>     a) I doubt this optimization is valid for all targets, because
>>     LLVM language reference
>>     (http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction)
>>     says addrspacecast "can be a no-op cast or a complex value
>>     modification, depending on the target and the address space
pair."
>     I think most of the simple cast optimizations would be acceptable.
>     The addrspacecasted pointer still needs to point to the same
>     memory location, so changing an access to use a different address
>     space would be OK. I think canonicalizing accesses to use the
>     original address space of a casted pointer when possible would
>     make sense.
>
>
> "the address space conversion is legal then both result and operand 
> refer to the same memory location". I don't quite understand this 
> sentence. Does the same memory location mean the same numeric value?No, it means they could both have different values that point to the 
same physical location. Storing to a pointer in one address space should 
have the same effect as storing to the addrspacecasted pointer, though 
it might not use the same value or instructions to do so.

>
>
>>     b) NVPTX and R600 have different address numbering for the
>>     generic address space, which makes things more complicated.
>>     c) We don't have a good understanding of the R600 backend.
>>
>
>     R600 currently does not support the flat address space
>     instructions intended to use for the generic address space. I
>     posted a patch a while ago that half added it, which I can try to
>     work on finishing if it would help.
>
>     I also do not understand how NVPTX uses address spaces,
>     particularly how it can use 0 as the the generic address space.
>
>
> NVPTX backend generates ld.f32 for reading from the generic address 
> space. There's no special machine instruction to read/write from/to 
> the generic address space in R600?New hardware does have flat address space instructions, which is what my 
patch adds support for. They're just not defined in the target yet. This 
flat address space is separate different from 0 / the default. I think 
of addrspace(0) as the address space of allocas, so I don't understand 
how that can be consolidated with generic accesses of the other address 
spaces. Does NVPTX not differentiate between accesses of a generic 
pointer and private / alloca'd memory?
>
>
>>     2. How effective do we want this optimization to be?
>>
>>     In the short term, I want it to be able to eliminate unnecessary
>>     non-generic-to-generic addrspacecasts the front-end generates for
>>     the NVPTX target. For example,
>>
>>     %p1 = addrspace i32 addrspace(3)* %p0 to i32*
>>     %v = load i32* %p1
>>
>>     =>
>>
>>     %v = load i32 addrspace(3)* %p0
>>
>>     We want similar optimization for store+addrspacecast and
>>     gep+addrspacecast as well.
>>
>>     In a long term, we could for sure improve this optimization to
>>     handle more instructions and more patterns.
>>
>     I believe most of the cast simplifications that apply to bitcasts
>     of pointers also apply to addrspacecast. I have some patches
>     waiting that extend some of the more basic ones to understand
>     addrspacecast (e.g.
>    
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html),
>     plus a few more that I haven't posted yet. Mostly they are little
>     cast simplifications like your example in instcombine, but also
>     SROA to eliminate allocas that are addrspacecasted.
>
>
> We also think InstCombine is a good place to put this optimization, if 
> we decide to go with target-independent. Looking forward to your patches!
>
I think that strategy only gets you part of the way to ideal. For 
example, preferring to use the original address space works well for 
accesses to objects where you start with the known address space. You 
could also have a function with a generic address space argument casted 
back to a specific address space. Preferring the original address space 
in that case is the opposite of what you want, although I expect this 
case will end up being much less common in real code and will tend to go 
away after inlining.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140325/91f15c06/attachment.html>

Jingyue Wu

2014-Mar-26 17:33 UTC

head link

[LLVMdev] Reducing Generic Address Space Usage

On Tue, Mar 25, 2014 at 5:32 PM, Matt Arsenault
<Matthew.Arsenault at amd.com>wrote:
>  On 03/25/2014 05:07 PM, Jingyue Wu wrote:
>
>
>
>
> On Tue, Mar 25, 2014 at 3:21 PM, Matt Arsenault <Matthew.Arsenault at
amd.com
> > wrote:
>
>>  On 03/25/2014 02:31 PM, Jingyue Wu wrote:
>>
>>
>> However, we have three concerns on this:
>> a) I doubt this optimization is valid for all targets, because LLVM
>> language reference (
>> http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction) says
>> addrspacecast "can be a no-op cast or a complex value
modification,
>> depending on the target and the address space pair."
>>
>>  I think most of the simple cast optimizations would be acceptable. The
>> addrspacecasted pointer still needs to point to the same memory
location,
>> so changing an access to use a different address space would be OK. I
think
>> canonicalizing accesses to use the original address space of a casted
>> pointer when possible would make sense.
>>
>
>  "the address space conversion is legal then both result and operand
refer
> to the same memory location". I don't quite understand this
sentence. Does
> the same memory location mean the same numeric value?
>
> No, it means they could both have different values that point to the same
> physical location. Storing to a pointer in one address space should have
> the same effect as storing to the addrspacecasted pointer, though it might
> not use the same value or instructions to do so.
>
>That makes sense. Thanks!

>
>
>
>
>>
>>
>>   b) NVPTX and R600 have different address numbering for the generic
>> address space, which makes things more complicated.
>> c) We don't have a good understanding of the R600 backend.
>>
>>
>>  R600 currently does not support the flat address space instructions
>> intended to use for the generic address space. I posted a patch a while
ago
>> that half added it, which I can try to work on finishing if it would
help.
>>
>> I also do not understand how NVPTX uses address spaces, particularly
how
>> it can use 0 as the the generic address space.
>>
>
>  NVPTX backend generates ld.f32 for reading from the generic address
> space. There's no special machine instruction to read/write from/to the
> generic address space in R600?
>
> New hardware does have flat address space instructions, which is what my
> patch adds support for. They're just not defined in the target yet.
This
> flat address space is separate different from 0 / the default. I think of
> addrspace(0) as the address space of allocas, so I don't understand how
> that can be consolidated with generic accesses of the other address spaces.
> Does NVPTX not differentiate between accesses of a generic pointer and
> private / alloca'd memory?
>
>See Justin's followup. Looks like this optimization can benefit local
accesses as well.

>
>
>
>>
>>
>>   2. How effective do we want this optimization to be?
>>
>>  In the short term, I want it to be able to eliminate unnecessary
>> non-generic-to-generic addrspacecasts the front-end generates for the
NVPTX
>> target. For example,
>>
>>  %p1 = addrspace i32 addrspace(3)* %p0 to i32*
>> %v = load i32* %p1
>>
>>  =>
>>
>>  %v = load i32 addrspace(3)* %p0
>>
>>  We want similar optimization for store+addrspacecast and
>> gep+addrspacecast as well.
>>
>>  In a long term, we could for sure improve this optimization to handle
>> more instructions and more patterns.
>>
>>    I believe most of the cast simplifications that apply to bitcasts of
>> pointers also apply to addrspacecast. I have some patches waiting that
>> extend some of the more basic ones to understand addrspacecast (e.g.
>>
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140120/202296.html),
>> plus a few more that I haven't posted yet. Mostly they are little
cast
>> simplifications like your example in instcombine, but also SROA to
>> eliminate allocas that are addrspacecasted.
>>
>
>  We also think InstCombine is a good place to put this optimization, if
> we decide to go with target-independent. Looking forward to your patches!
>
>
> I think that strategy only gets you part of the way to ideal. For example,
> preferring to use the original address space works well for accesses to
> objects where you start with the known address space. You could also have a
> function with a generic address space argument casted back to a specific
> address space. Preferring the original address space in that case is the
> opposite of what you want, although I expect this case will end up being
> much less common in real code and will tend to go away after inlining.
>
>You're right. I overlooked this case. I doubt a CUDA program would even
uses generic-to-non-generic casts, because non-generic address space
qualifiers only qualify declarations.

The backend can indicate which address spaces it prefers using some flags
(e.g., preferNonGenericPointers as Justin suggested). InstCombine can then
look at these flags to decide what to do.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140326/f1db6ac2/attachment.html>

Philip Reames

2014-Mar-26 21:16 UTC

head link

[LLVMdev] Reducing Generic Address Space Usage

On 03/25/2014 05:32 PM, Matt Arsenault wrote:> On 03/25/2014 05:07 PM, Jingyue Wu wrote:
>>
>>
>>
>> On Tue, Mar 25, 2014 at 3:21 PM, Matt Arsenault 
>> <Matthew.Arsenault at amd.com <mailto:Matthew.Arsenault at
amd.com>> wrote:
>>
>>     On 03/25/2014 02:31 PM, Jingyue Wu wrote:
>>>
>>>     However, we have three concerns on this:
>>>     a) I doubt this optimization is valid for all targets, because
>>>     LLVM language reference
>>>    
(http://llvm.org/docs/LangRef.html#addrspacecast-to-instruction)
>>>     says addrspacecast "can be a no-op cast or a complex value
>>>     modification, depending on the target and the address space
pair."
>>     I think most of the simple cast optimizations would be
>>     acceptable. The addrspacecasted pointer still needs to point to
>>     the same memory location, so changing an access to use a
>>     different address space would be OK. I think canonicalizing
>>     accesses to use the original address space of a casted pointer
>>     when possible would make sense.
>>
>>
>> "the address space conversion is legal then both result and
operand
>> refer to the same memory location". I don't quite understand
this
>> sentence. Does the same memory location mean the same numeric value?
> No, it means they could both have different values that point to the 
> same physical location. Storing to a pointer in one address space 
> should have the same effect as storing to the addrspacecasted pointer, 
> though it might not use the same value or instructions to do so.This interpretation fits my reasoning as well.

I would assume it's legal to reason about aliasing (for value 
forwarding, etc..), but not to reason about the semantics of the exact 
load used.  (i.e. one could perform a runtime manipulation of the loaded 
value, the other might not)

Philip

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140326/8a33832d/attachment.html>

llvm dev - Mar 2014 - [LLVMdev] Reducing Generic Address Space Usage

[LLVMdev] Reducing Generic Address Space Usage

[LLVMdev] Reducing Generic Address Space Usage

[LLVMdev] Reducing Generic Address Space Usage

[LLVMdev] Reducing Generic Address Space Usage