On Aug 7, 2013, at 5:12 PM, Michele Scandale <michele.scandale at gmail.com> wrote:

> On 08/08/2013 02:02 AM, Justin Holewinski wrote:
>> This worries me a bit. This would introduce language-specific
>> processing into SelectionDAG. OpenCL maps address spaces one way, other
>> languages map them in other ways. Currently, it is the job of the
>> front-end to map pointers into the correct address space for the target
>> (hence the address space map in clang). With (my understanding of) this
>> proposal, there would be a pre-defined set of language-specific address
>> spaces that the target would need to know about. IMO it should be the
>> job of the front-end to do this mapping.
>
> The discussion began as a search for a way to represent high-level address space information in the IR, distinct from target address spaces (so that the information is orthogonal to the mapping, which also handles those targets that have the trivial mapping).
>
> My interpretation of the solution proposed by Pete is that the frontend emits metadata that describes the address spaces (overlap information and the target-specific mapping). Instruction selection simply applies the mapping encoded in the metadata. So there is no pre-defined set; there is only a "table-driven" mapping algorithm implemented in the instruction selection phase, where the table is encoded as metadata.

I think it's fair to have this be dealt with by targets instead of the front-end. That way the optimizer can remain generic and use only the metadata. CPU targets will just map every address space to 0, as they have only a single physical memory space. GPU targets such as PTX and R600 can map to the actual HW spaces they want.
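Concretely, such a table-driven encoding could look something like the sketch below (the metadata name !opencl.addrspace.map and the pairing layout are hypothetical, not an existing LLVM convention; syntax follows the 3.x-era metadata !{...} form):

```llvm
; Hypothetical per-module mapping table: each entry pairs a
; language-level address space number with the target address
; space it should be lowered to during instruction selection.
!opencl.addrspace.map = !{!0, !1, !2, !3}
!0 = metadata !{i32 0, i32 0} ; private  -> target AS 0
!1 = metadata !{i32 1, i32 1} ; global   -> target AS 1
!2 = metadata !{i32 2, i32 3} ; local    -> e.g. PTX shared (AS 3)
!3 = metadata !{i32 3, i32 4} ; constant -> e.g. PTX const (AS 4)
```

A CPU target consuming such a table would simply map every entry's second field to 0.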
This way you have the target-specific information in the backend, where I believe it should be, and the front-end can stay target agnostic (note, I know it's not really agnostic and already contains target-specific information, but I just don't want to add more unless it's really needed).

On the casting between address spaces topic: "you can cast between the generic address space and global/local/private, so there's also that to consider." This terrifies me. I don't know how to generate code for this on a system which has disjoint physical memory without branching on every memory access to that address space.

> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Aug 7, 2013, at 6:16 PM, Pete Cooper <peter_cooper at apple.com> wrote:

> On Aug 7, 2013, at 5:12 PM, Michele Scandale <michele.scandale at gmail.com> wrote:
>
>> [...]
>
> I think its fair to have this be dealt with by targets instead of the front-end. That way the optimizer can remain generic and use only the metadata. CPU targets will just map every address space to 0 as they have only a single physical memory space. GPU targets such as PTX and R600 can map to the actual HW spaces they want.
>
> This way you have the target specific information in the backend where I believe it should be, and the front-end can target agnostic (note, I know, its not really agnostic and already contains target specific information, but I just don't want to add more unless its really needed)
>
> On the casting between address spaces topic "you can cast between the generic address space and global/local/private, so there's also that to consider.". This terrifies me. I don't know how to generate code for this on a system which has disjoint physical memory without branching on every memory access to that address space.

Thinking about this more… If you do implement something like alias analysis for address spaces, then casting between address spaces will be unsafe.

Let's say we have 3 address spaces: local, global, all. Local and global are disjoint; all is the union of the two. If you cast local to all, or global to all, then alias analysis will be OK, as an 'all' pointer already aliased local or global. However, if you cast local to all to global, then you no longer know whether other local pointers alias that so-called global pointer. This is analogous to casting int* to float* in C++ and dereferencing the result: accessing memory through the wrong type is undefined behavior according to the spec, and the compiler will optimize it as such. Personally I'd treat it as undefined behavior and implement your code as such, but I'm no CL/CUDA expert, so others may disagree.
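In IR terms, the problematic cast chain looks roughly like this (the address-space numbers are illustrative, and the cast notation here is just for exposition):

```llvm
; Illustrative numbering: global = addrspace(1), local = addrspace(2),
; "all" = addrspace(4), the union of the two disjoint spaces.
; Up-cast is safe: an "all" pointer was already assumed to alias both.
%a = addrspacecast i32 addrspace(2)* %p to i32 addrspace(4)*
; Down-cast into a *different* disjoint space is the unsafe step:
; alias analysis would treat %g as disjoint from every local pointer,
; yet it still points into local memory.
%g = addrspacecast i32 addrspace(4)* %a to i32 addrspace(1)*
```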
On 08/08/2013 03:16 AM, Pete Cooper wrote:

> On Aug 7, 2013, at 5:12 PM, Michele Scandale <michele.scandale at gmail.com> wrote:
>
>> [...]
>
> I think its fair to have this be dealt with by targets instead of the front-end. That way the optimizer can remain generic and use only the metadata. CPU targets will just map every address space to 0 as they have only a single physical memory space. GPU targets such as PTX and R600 can map to the actual HW spaces they want.

Why should a backend be responsible for (that is, have knowledge of) a mapping between high-level address spaces and low-level address spaces? Why should the X86 backend be aware of OpenCL address spaces, or of any other address spaces?
As with other aspects, I find it more direct and intuitive to anticipate target information in the frontend (this is already done and accepted), so as to keep the middle-end and back-end source-language independent (no language-specific knowledge is required, because different frontends can be built on top of them).

Maybe a way to decouple the frontend from the specific target is possible, so that the target-independent part of the code generator supports a set of languages with common concepts (like OpenCL/CUDA), but that is still language dependent!

> This way you have the target specific information in the backend where I believe it should be, and the front-end can target agnostic (note, I know, its not really agnostic and already contains target specific information, but I just don't want to add more unless its really needed)
>
> On the casting between address spaces topic "you can cast between the generic address space and global/local/private, so there's also that to consider.". This terrifies me. I don't know how to generate code for this on a system which has disjoint physical memory without branching on every memory access to that address space.

The OpenCL 2.0 specification says that a runtime resolution to a named address space is required in order to use a pointer in the generic address space.

-Michele
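To make that cost concrete, here is a rough sketch of what lowering a single load through a generic pointer might require on a target with disjoint local/global memories (the address-window test and space numbers are purely illustrative; 3.x-era IR syntax):

```llvm
define i32 @load_generic(i32 addrspace(4)* %p) {
entry:
  ; Decide at run time which physical memory the address belongs to.
  %addr = ptrtoint i32 addrspace(4)* %p to i32
  %is.local = icmp ult i32 %addr, 65536   ; illustrative local window
  br i1 %is.local, label %local, label %global
local:
  %lp = inttoptr i32 %addr to i32 addrspace(2)*
  %lv = load i32 addrspace(2)* %lp
  ret i32 %lv
global:
  %gp = inttoptr i32 %addr to i32 addrspace(1)*
  %gv = load i32 addrspace(1)* %gp
  ret i32 %gv
}
```

This is exactly the per-access branching that makes generic pointers expensive on such hardware.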
On Aug 7, 2013, at 6:38 PM, Michele Scandale <michele.scandale at gmail.com> wrote:

> On 08/08/2013 03:16 AM, Pete Cooper wrote:
>>
>> [...]
>>
>> I think its fair to have this be dealt with by targets instead of the front-end. That way the optimizer can remain generic and use only the metadata. CPU targets will just map every address space to 0 as they have only a single physical memory space. GPU targets such as PTX and R600 can map to the actual HW spaces they want.
>
> Why a backend should be responsible (meaning have knowledge) for a mapping between high level address spaces and low level address spaces?

That's true. I'm thinking entirely from the perspective of the backend doing CL/CUDA. But actually LLVM is language agnostic. That is still something the metadata could solve: the front-end could generate the metadata I suggested earlier, which will tell the backend how to do the mapping. Then the backend only needs to read the metadata.

> Why X86 backend should be aware of opencl address spaces or any other address spaces?

The only reason I can think of is that this allows the address space alias analysis to occur, and all of the optimizations you might want to implement on top of it. Otherwise you'll need the front-end to put everything in address space 0, and you'll have lost some opportunity to optimize in that way for x86.

> Like for other aspects I see more direct and intuitive to anticipate target information in the frontend (this is already done and accepted) to have a middle-end and back-end source language dependent (no specific language knowledge is required, because different frontends could be built on top of this).
>
> Maybe a way to decouple the frontend and the specific target is possible in order to have in the target independent part of the code-generator a support for a set of language with common concept (like opencl/cuda) but it's still language dependent!

Yes, that could work. Actually the numbers are probably not the important thing. It's the names that really tell you what the address space is for. The backend needs to know what loading from a local means. It's almost unimportant what specific number a front-end chooses for that address space. We know the front-end is really going to choose 2 (from what you said earlier), but the backend just needs to know how to load/store a local.
So perhaps the front-end should really be generating metadata which tells the target what address space it chose for a memory space. That is:

    !private_memory = metadata !{ i32 0 }
    !global_memory = metadata !{ i32 1 }
    !local_memory = metadata !{ i32 2 }
    !constant_memory = metadata !{ i32 3 }

Unfortunately you'd have to essentially reserve those metadata names for your use (better names than I chose, of course), but this might be reasonable. You could alternately use the example I first gave, but just add a name field to it. I guess targets would have to either assert or default to address space 0 when they see an address space without associated metadata.

>
>> [...]
>
> The OpenCL 2.0 specification says that a runtime resolution to a named address space is required in order to use a pointer in the generic address space.

Ouch! I can't imagine that's good for performance on some architectures. But at least it's been considered and defined.

Pete

> -Michele
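For the alternative mentioned above (the first table with a name field added), the encoding might look like this (again, the metadata names are hypothetical placeholders, in 3.x-era syntax):

```llvm
; Hypothetical combined form: a name identifying the memory space
; plus the address space number the front-end chose for it.
!addrspace.info = !{!0, !1, !2, !3}
!0 = metadata !{metadata !"private",  i32 0}
!1 = metadata !{metadata !"global",   i32 1}
!2 = metadata !{metadata !"local",    i32 2}
!3 = metadata !{metadata !"constant", i32 3}
```

A target finding no entry for a pointer's address space would then take the assert-or-default-to-0 path described above.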