thr3ads.net - llvm dev - [LLVMdev] Address space extension [Aug 2013]

Pete Cooper

2013-Aug-08 01:52 UTC

[LLVMdev] Address space extension

On Aug 7, 2013, at 6:38 PM, Michele Scandale <michele.scandale at gmail.com> wrote:
> On 08/08/2013 03:16 AM, Pete Cooper wrote:
>>
>> On Aug 7, 2013, at 5:12 PM, Michele Scandale <michele.scandale at gmail.com> wrote:
>>
>>> On 08/08/2013 02:02 AM, Justin Holewinski wrote:
>>>> This worries me a bit.  This would introduce language-specific
>>>> processing into SelectionDAG.  OpenCL maps address spaces one way, other
>>>> languages map them in other ways.  Currently, it is the job of the
>>>> front-end to map pointers into the correct address space for the target
>>>> (hence the address space map in clang).  With (my understanding of) this
>>>> proposal, there would be a pre-defined set of language-specific address
>>>> spaces that the target would need to know about. IMO it should be the
>>>> job of the front-end to do this mapping.
>>>
>>> The discussion began with how to represent high-level address space
>>> information in the IR separately from target address spaces (keeping that
>>> information orthogonal to the mapping, so that targets with the trivial
>>> mapping are handled too).
>>>
>>> My interpretation of the solution proposed by Pete is that the frontend
>>> emits metadata that describes the address spaces (overlap information and
>>> the target-specific mapping). Instruction selection simply applies the
>>> mapping encoded in the metadata. So there is no pre-defined set, only a
>>> table-driven mapping algorithm implemented in the instruction selection
>>> phase, with the table encoded as metadata.
>> I think it’s fair to have this be dealt with by targets instead of the
>> front-end.  That way the optimizer can remain generic and use only the
>> metadata.  CPU targets will just map every address space to 0 as they have
>> only a single physical memory space.  GPU targets such as PTX and R600 can
>> map to the actual HW spaces they want.
>
> Why should a backend be responsible for (meaning have knowledge of) a mapping
> between high-level address spaces and low-level address spaces?

That’s true.  I’m thinking entirely from the perspective of the backend doing
CL/CUDA.  But actually LLVM is language agnostic.  That is still something the
metadata could solve.  The front-end could generate the metadata I suggested
earlier, which will tell the backend how to do the mapping.  Then the backend
only needs to read the metadata.

> Why should the X86 backend be aware of OpenCL address spaces or any other
> address spaces?

The only reason I can think of is that this allows address space alias
analysis to occur, and all of the optimizations you might want to implement on
top of it.  Otherwise you’ll need the front-end to put everything in address
space 0 and you’ll have lost some opportunity to optimize in that way for
x86.

> Like for other aspects I see more direct and intuitive to anticipate target
> information in the frontend (this is already done and accepted) to have a
> middle-end and back-end source language dependent (no specific language
> knowledge is required, because different frontends could be built on top of
> this).
>
> Maybe a way to decouple the frontend and the specific target is possible, in
> order to have in the target-independent part of the code generator support
> for a set of languages with common concepts (like OpenCL/CUDA), but it's
> still language dependent!

Yes, that could work.  Actually the numbers are probably not the important
thing.  It’s the names that really tell you what the address space is for.  The
backend needs to know what loading from a local means.  It’s almost unimportant
what specific number a front-end chooses for that address space.  We know the
front-end is really going to choose 2 (from what you said earlier), but the
backend just needs to know how to load/store a local.

So perhaps the front-end should really be generating metadata which tells the
target what address space it chose for a memory space.  That is

!private_memory = metadata !{ i32 0 }
!global_memory = metadata !{ i32 1 }
!local_memory = metadata !{ i32 2 }
!constant_memory = metadata !{ i32 3 }

Unfortunately you’d have to essentially reserve those metadata names for your
use (better names than I chose, of course), but this might be reasonable.  You
could alternatively use the example I first gave, but just add a name field to it.

I guess targets would have to either assert or default to address space 0 when
they see an address space without associated metadata.
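
A minimal Python sketch of that lookup, purely as an illustration and not LLVM API. The dictionary stands in for the four named metadata entries above; `memory_space_name`, `lowered_address_space`, and the `strict` flag are hypothetical names introduced here to show the "assert or default to 0" choice.

```python
# Hypothetical stand-in for the named metadata in the example above:
# memory-space name -> address space number chosen by the front-end.
MEMORY_SPACE_METADATA = {
    "private_memory": 0,
    "global_memory": 1,
    "local_memory": 2,
    "constant_memory": 3,
}


def memory_space_name(addrspace, metadata, strict=False):
    """Return the name the front-end attached to an address space number.

    With strict=True a missing entry raises (the "assert" option);
    otherwise None is returned so the caller can fall back.
    """
    for name, number in metadata.items():
        if number == addrspace:
            return name
    if strict:
        raise LookupError(f"no metadata for addrspace({addrspace})")
    return None


def lowered_address_space(addrspace, metadata, strict=False):
    """Address space a target would select instructions for."""
    if memory_space_name(addrspace, metadata, strict) is None:
        return 0  # no metadata: treat as the default address space
    return addrspace
```

Under these assumptions, `memory_space_name(2, MEMORY_SPACE_METADATA)` gives `"local_memory"`, while an unlisted number such as 7 is lowered as address space 0 unless `strict=True` is passed.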
>
>> This way you have the target-specific information in the backend where
>> I believe it should be, and the front-end can be target agnostic (note, I
>> know, it’s not really agnostic and already contains target-specific
>> information, but I just don’t want to add more unless it’s really needed).
>>
>> On the casting between address spaces topic, "you can cast between
>> the generic address space and global/local/private, so there's also that
>> to consider."  This terrifies me.  I don’t know how to generate code for
>> this on a system which has disjoint physical memory without branching on
>> every memory access to that address space.
>
> The OpenCL 2.0 specification says that a runtime resolution to a named
> address space is required in order to use a pointer in the generic address
> space.

Ouch!  I can’t imagine that’s good for performance on some architectures.  But at
least it’s been considered and defined.

Pete
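
The cost of resolving a generic pointer at run time can be pictured with a small Python model. This is only a sketch under stated assumptions: the three lists stand in for disjoint physical memories, and `GenericPointer` and `load_generic` are hypothetical names, not anything defined by OpenCL or LLVM.

```python
from dataclasses import dataclass

# Three disjoint memories, modelled as separate lists (an assumption
# made for illustration only).
MEMORIES = {
    "global": [10, 11, 12, 13],
    "local": [20, 21, 22, 23],
    "private": [30, 31, 32, 33],
}


@dataclass(frozen=True)
class GenericPointer:
    space: str   # named address space, known only at run time
    offset: int  # index into that space's memory


def load_generic(ptr):
    """Load through a generic pointer by branching on its named space."""
    # Every access pays for this dispatch, which is the concern raised
    # above for hardware with disjoint physical memories.
    if ptr.space == "global":
        return MEMORIES["global"][ptr.offset]
    if ptr.space == "local":
        return MEMORIES["local"][ptr.offset]
    if ptr.space == "private":
        return MEMORIES["private"][ptr.offset]
    raise ValueError(f"unknown address space: {ptr.space}")
```

A pointer whose space is known statically would skip the branch entirely, which is why a named address space is preferable whenever it can be determined at compile time.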

Michele Scandale

2013-Aug-08 02:23 UTC

[LLVMdev] Address space extension

On 08/08/2013 03:52 AM, Pete Cooper wrote:
>> Why a backend should be responsible (meaning have knowledge) for a
>> mapping between high level address spaces and low level address spaces?
> That’s true.  I’m thinking entirely from the perspective of the backend
> doing CL/CUDA.  But actually LLVM is language agnostic.  That is still
> something the metadata could solve.  The front-end could generate the
> metadata I suggested earlier which will tell the backend how to do the
> mapping.  Then the backend only needs to read the metadata.
From this I understand that the IR contains addrspace(N) with
N = 0, 1, 2, 3, ... according to the target-independent mapping done by the
frontend to represent different address spaces (for OpenCL 1.2: 0 =
private, 1 = global, 2 = local, 3 = constant).

Then the frontend emits metadata that contains the map from "language
address spaces" to "target address spaces" (for X86 this would be
0->0, 1->0, 2->0, 3->0).

Finally, instruction selection will use this information to select
instructions correctly and to tag each machine instruction
with both its logical and physical address spaces.
>> Why X86 backend should be aware of opencl address spaces or any other
>> address spaces?
> The only reason i can think of is that this allows the address space
> alias analysis to occur, and all of the optimizations you might want to
> implement on top of it.  Otherwise you’ll need the front-end to put
> everything in address space 0 and you’ll have lost some opportunity to
> optimize in that way for x86.
The mapping phase will satisfy the backend precondition (no address spaces
other than zero). With both kinds of address space information available in
the IR and also afterwards, the alias analysis should be feasible.
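
A minimal Python sketch of why keeping the logical spaces matters for alias analysis. It assumes the logical address spaces are pairwise disjoint (as the named OpenCL 1.2 spaces are); `X86_MAP`, `may_alias_logical`, and `may_alias_physical_only` are hypothetical names used only for this illustration.

```python
# Logical -> physical map for a target with one flat memory, matching
# the "0->0 1->0 2->0 3->0" example for X86.
X86_MAP = {0: 0, 1: 0, 2: 0, 3: 0}


def may_alias_logical(space_a, space_b):
    """Alias query using logical spaces, assumed pairwise disjoint."""
    return space_a == space_b


def may_alias_physical_only(space_a, space_b, mapping):
    """Alias query after the logical information has been dropped."""
    return mapping[space_a] == mapping[space_b]


# A global (1) and a local (2) pointer:
print(may_alias_logical(1, 2))                 # False: provably distinct
print(may_alias_physical_only(1, 2, X86_MAP))  # True: must assume aliasing
```

With only the physical space available, every pair of pointers on such a target has to be treated as potentially aliasing.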
>> Like for other aspects I see more direct and intuitive to anticipate
>> target information in the frontend (this is already done and accepted)
>> to have a middle-end and back-end source language dependent (no
>> specific language knowledge is required, because different frontends
>> could be built on top of this).
>>
>> Maybe a way to decouple the frontend and the specific target is
>> possible in order to have in the target independent part of the
>> code-generator a support for a set of language with common concept
>> (like opencl/cuda) but it's still language dependent!
> Yes, that could work.  Actually the numbers are probably not the
> important thing.  Its the names that really tell you what the address
> space is for.  The backend needs to know what loading from a local
> means.  Its almost unimportant what specific number a front-end chooses
> for that address space.  We know the front-end is really going to choose
> 2 (from what you said earlier), but the backend just needs to know how
> to load/store a local.
>
> So perhaps the front-end should really be generating metadata which
> tells the target what address space it chose for a memory space.  That is
>
> !private_memory = metadata !{ i32 0 }
> !global_memory = metadata !{ i32 1 }
> !local_memory = metadata !{ i32 2 }
> !constant_memory = metadata !{ i32 3 }
>
> Unfortunately you’d have to essentially reserve those metadata names for
> your use (better names than i chose of course), but this might be
> reasonable.  You could alternately use the example I first gave, but
> just add a name field to it.
>
> I guess targets would have to either assert or default to address space
> 0 when they see an address space without associated metadata.
This part is not clear: in the X86 backend, private/global/local
memories are still meaningless. Indeed, this scheme is limited to the set of
languages that support these abstractions.

IMO a more general solution would be to delegate the mapping resolution
entirely to the frontend, which would generate the map from logical to
physical address spaces.

Considering also that addrspace is used to support the C address
space extension, which maps from C to physical numbered address spaces,
a default implicit identity mapping would probably be fine when
no metadata is provided.
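
A minimal sketch of such a table-driven mapping with an identity default, in Python for brevity. The function name `to_physical` is hypothetical, and falling back to identity for an individual missing entry (not just for a missing table) is an assumption of this sketch.

```python
def to_physical(logical, mapping=None):
    """Map a logical address space number to a physical one.

    mapping is the table the frontend would emit as metadata. When it is
    absent, or has no entry for this space, the number is kept as-is
    (identity), so hand-written target-specific numbers still work.
    """
    if mapping is None:
        return logical
    return mapping.get(logical, logical)


# OpenCL-style logical spaces on a target with a single memory.
X86_OPENCL_MAP = {0: 0, 1: 0, 2: 0, 3: 0}

print(to_physical(2, X86_OPENCL_MAP))  # 0
print(to_physical(2))                  # 2 (no metadata: identity)
```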


Thanks again.

-Michele

Pete Cooper

2013-Aug-08 03:23 UTC

[LLVMdev] Address space extension

On Aug 7, 2013, at 7:23 PM, Michele Scandale <michele.scandale at gmail.com> wrote:
> On 08/08/2013 03:52 AM, Pete Cooper wrote:
>>> Why a backend should be responsible (meaning have knowledge) for a
>>> mapping between high level address spaces and low level address
>>> spaces?
>> That’s true.  I’m thinking entirely from the perspective of the backend
>> doing CL/CUDA.  But actually LLVM is language agnostic.  That is still
>> something the metadata could solve.  The front-end could generate the
>> metadata I suggested earlier which will tell the backend how to do the
>> mapping.  Then the backend only needs to read the metadata.
> 
> From here I understand that in the IR there are addrspace(N) where
> N=0,1,2,3,... according to the target independent mapping done by the
> frontend to represent different address spaces (for OpenCL 1.2 0 = private,
> 1 = global, 2 = local, 3 = constant).
>
> Then the frontend emits metadata that contains the map from "language
> address spaces" to "target address spaces" (for X86 would be
> 0->0 1->0 2->0 3->0).
>
> Finally the instruction selection will use this information to perform
> the instruction selection correctly and tagging the machine instruction with
> both logical and physical address spaces.

Sounds good.

>>> Why X86 backend should be aware of opencl address spaces or any other
>>> address spaces?
>> The only reason i can think of is that this allows the address space
>> alias analysis to occur, and all of the optimizations you might want to
>> implement on top of it.  Otherwise you’ll need the front-end to put
>> everything in address space 0 and you’ll have lost some opportunity to
>> optimize in that way for x86.
> 
> The mapping phase will allow the backend precondition to be satisfied
> (no address spaces other than zero). Having both kinds of information in
> the IR and also afterwards, the alias analysis should be feasible.
> 
>>> Like for other aspects I see more direct and intuitive to anticipate
>>> target information in the frontend (this is already done and accepted)
>>> to have a middle-end and back-end source language dependent (no
>>> specific language knowledge is required, because different frontends
>>> could be built on top of this).
>>> 
>>> Maybe a way to decouple the frontend and the specific target is
>>> possible in order to have in the target independent part of the
>>> code-generator a support for a set of language with common concept
>>> (like opencl/cuda) but it's still language dependent!
>> Yes, that could work.  Actually the numbers are probably not the
>> important thing.  Its the names that really tell you what the address
>> space is for.  The backend needs to know what loading from a local
>> means.  Its almost unimportant what specific number a front-end chooses
>> for that address space.  We know the front-end is really going to choose
>> 2 (from what you said earlier), but the backend just needs to know how
>> to load/store a local.
>> 
>> So perhaps the front-end should really be generating metadata which
>> tells the target what address space it chose for a memory space.  That is
>> 
>> !private_memory = metadata !{ i32 0 }
>> !global_memory = metadata !{ i32 1 }
>> !local_memory = metadata !{ i32 2 }
>> !constant_memory = metadata !{ i32 3 }
>> 
>> Unfortunately you’d have to essentially reserve those metadata names for
>> your use (better names than i chose of course), but this might be
>> reasonable.  You could alternately use the example I first gave, but
>> just add a name field to it.
>> 
>> I guess targets would have to either assert or default to address space
>> 0 when they see an address space without associated metadata.
> 
> This part is not clear, still in the X86 backend private/global/local
> memories are meaningless. Indeed it is limited to a set of languages that
> support these abstractions.

Yeah.  The address spaces don’t mean anything in terms of instruction selection
for x86.  You mentioned earlier putting the physical and logical address spaces
on the machine instr.  If you wanted, you could use these to perform code motion
on x86 which would otherwise not be possible, but that’s the only reason I can
think of for why x86 would benefit from address space information in the
backend.

> IMO a more general solution would be to fully demand to the frontend the
> mapping resolution generating the map from logical to physical address spaces.
> 
> Considering also the fact that addrspace is used to support C address space
> extension that maps from C to physical numbered address spaces, maybe a default
> implicit identity function as mapping would be fine when no metadata are
> provided.

Yeah, I think a default identity mapping is a good idea.  x86 for example uses
address spaces 256 and 257 for the fs and gs segments.  Without this default
mapping, tests using those segments would fail.

Thanks,
Pete

Justin Holewinski

2013-Aug-08 12:05 UTC

[LLVMdev] Address space extension

On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com> wrote:
> Yes, that could work.  Actually the numbers are probably not the important
> thing.  Its the names that really tell you what the address space is for.
> The backend needs to know what loading from a local means.  Its almost
> unimportant what specific number a front-end chooses for that address
> space.  We know the front-end is really going to choose 2 (from what you
> said earlier), but the backend just needs to know how to load/store a
> local.
>
> So perhaps the front-end should really be generating metadata which tells
> the target what address space it chose for a memory space.  That is
>
> !private_memory = metadata !{ i32 0 }
> !global_memory = metadata !{ i32 1 }
> !local_memory = metadata !{ i32 2 }
> !constant_memory = metadata !{ i32 3 }
>
This is specific to an OpenCL front-end.  How would this translate to a
language with a different memory hierarchy?

I would also like to preserve the ability for front-ends to directly assign
address spaces in a target-dependent manner.  Currently, I can write IR
that explicitly assigns global variables to the PTX "shared" address space
(for example).  Under this proposal, I would need to use address space 2
(because that is what has been decreed as OpenCL "local"), and insert
meta-data that tells the PTX back-end to map this to its "shared" address
space.  Is that correct?

-- 

Thanks,

Justin Holewinski

Michele Scandale

2013-Aug-08 15:15 UTC

[LLVMdev] Address space extension

On 08/08/2013 02:05 PM, Justin Holewinski wrote:
> On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com> wrote:
>     So perhaps the front-end should really be generating metadata which
>     tells the target what address space it chose for a memory space.
>       That is
>
>     !private_memory = metadata !{ i32 0 }
>     !global_memory = metadata !{ i32 1 }
>     !local_memory = metadata !{ i32 2 }
>     !constant_memory = metadata !{ i32 3 }
>
>
> This is specific to an OpenCL front-end.  How would this translate to a
> language with a different memory hierarchy?
>
> I would also like to preserve the ability for front-ends to directly
> assign address spaces in a target-dependent manner.  Currently, I can
> write IR that explicitly assigns global variables to the PTX "shared"
> address space (for example).  Under this proposal, I would need to use
> address space 2 (because that is what has been decreed as OpenCL
> "local"), and insert meta-data that tells the PTX back-end to map this
> to its "shared" address space.  Is that correct?

I think the numeric representation of address spaces chosen by the front-end
would be language dependent: the values used in CUDA may be different from
the ones used in OpenCL.
I understand that it may be better to have it be target dependent as well,
but this is a decision for the frontend implementation.

-Michele

Tom Stellard

2013-Aug-08 18:55 UTC

[LLVMdev] Address space extension

On Thu, Aug 08, 2013 at 08:05:33AM -0400, Justin Holewinski wrote:
> On Wed, Aug 7, 2013 at 9:52 PM, Pete Cooper <peter_cooper at apple.com> wrote:
> > So perhaps the front-end should really be generating metadata which tells
> > the target what address space it chose for a memory space.  That is
> >
> > !private_memory = metadata !{ i32 0 }
> > !global_memory = metadata !{ i32 1 }
> > !local_memory = metadata !{ i32 2 }
> > !constant_memory = metadata !{ i32 3 }
> >
> 
> This is specific to an OpenCL front-end.  How would this translate to a
> language with a different memory hierarchy?
> 
> I would also like to preserve the ability for front-ends to directly assign
> address spaces in a target-dependent manner.  Currently, I can write IR
> that explicitly assigns global variables to the PTX "shared" address space
> (for example).  Under this proposal, I would need to use address space 2
> (because that is what has been decreed as OpenCL "local"), and insert
> meta-data that tells the PTX back-end to map this to its "shared" address
> space.  Is that correct?
> 
I agree with Justin here.  I prefer having the address spaces be
consistent across all languages.  If we have to start using metadata to
describe the address spaces, there will be information loss (e.g. GLSL
private memory may not be the same as OpenCL private memory).

Also, I'm not sure I understand what the advantage of using metadata
would be.  Is it only to make alias analysis easier?

-Tom
