thr3ads.net - llvm dev - [LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory Spaces [Oct 2011]


Peter Collingbourne

2011-Oct-14 16:55 UTC

[LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory Spaces

On Thu, Oct 13, 2011 at 04:21:23PM -0400, Justin Holewinski wrote:
> On Thu, Oct 13, 2011 at 4:16 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> > On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> > > Justin,
> > >  Out of these options, I would take the metadata approach for AA
> > > support.
> > >
> > > This doesn't solve the problem of different frontend/backends choosing
> > > different address space representations for the same language, but is
> > > the correct approach for providing extra information to the
> > > optimizations.
> > >
> > > The issue about memory spaces in general is a little different. For
> > > example, based on the code you posted below, address space 0 (default)
> > > is global in CUDA, but in OpenCL, the default address space is private.
> > > So, how does the ptx backend handle the differences? I think this is
> > > problematic as address spaces are language constructs and hardcoded at
> > > the frontend, but the backend needs to be able to interpret them
> > > differently based on the source language.
> > >
> > > One way this could be done is to have the backends have options, but
> > > then each backend would need to implement this. I think a better
> > > approach is to have some way to represent address spaces generically
> > > in the module.
> >
> > Address space 0 (i.e. the default address space) should always be the
> > address space on which the stack resides.  This is a requirement for
> > alloca to work correctly.  So for PTX, I think that address space 0
> > should be the local state space (but I noticed that at the moment it
> > is the global state space, which seems wrong IMHO).
>
> This is a bit hacky in the back-end at the moment.  When I started working
> with the back-end, address space 0 was already defined as global, and I
> have not broken that convention yet.
>
> Then again, the issue is not really that big of a deal, since we need to
> specially handle all "stack" accesses anyway.  It doesn't really matter
> much what address space is used.

What kind of special handling would be required?  And how can you
always tell whether or not an access through address space 0 would
be a stack access?  For example, consider the attached .ll file,
which compiles to a global store here.

Thanks,
-- 
Peter
-------------- next part --------------
target datalayout = "e-p:32:32-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx32--"

@g = common global i32 0, align 4

define ptx_kernel void @foo(i32 %pred) nounwind noinline {
entry:
  %p = alloca i32, align 4
  %tobool = icmp ne i32 %pred, 0
  %g.p = select i1 %tobool, i32* @g, i32* %p
  store i32 1, i32* %g.p, align 4, !tbaa !1
  ret void
}

!0 = metadata !{metadata !"int", metadata !1}
!1 = metadata !{metadata !"omnipotent char", metadata !2}
!2 = metadata !{metadata !"Simple C/C++ TBAA", null}

Villmow, Micah

2011-Oct-14 18:32 UTC

> -----Original Message-----
> From: Peter Collingbourne [mailto:peter at pcc.me.uk]
> Sent: Friday, October 14, 2011 9:55 AM
> To: Justin Holewinski
> Cc: Villmow, Micah; LLVM Developers Mailing List
> Subject: Re: [LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory
> Spaces
> 
> On Thu, Oct 13, 2011 at 04:21:23PM -0400, Justin Holewinski wrote:
> > On Thu, Oct 13, 2011 at 4:16 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > > On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> > > > Justin,
> > > >  Out of these options, I would take the metadata approach for AA
> > > > support.
> > > >
> > > > This doesn't solve the problem of different frontend/backends
> > > > choosing different address space representations for the same
> > > > language, but is the correct approach for providing extra
> > > > information to the optimizations.
> > > >
> > > > The issue about memory spaces in general is a little different.
> > > > For example, based on the code you posted below, address space 0
> > > > (default) is global in CUDA, but in OpenCL, the default address
> > > > space is private. So, how does the ptx backend handle the
> > > > differences? I think this is problematic as address spaces are
> > > > language constructs and hardcoded at the frontend, but the backend
> > > > needs to be able to interpret them differently based on the source
> > > > language.
> > > >
> > > > One way this could be done is to have the backends have options,
> > > > but then each backend would need to implement this. I think a
> > > > better approach is to have some way to represent address spaces
> > > > generically in the module.
> > >
> > > Address space 0 (i.e. the default address space) should always be
> > > the address space on which the stack resides.  This is a requirement
> > > for alloca to work correctly.  So for PTX, I think that address
> > > space 0 should be the local state space (but I noticed that at the
> > > moment it is the global state space, which seems wrong IMHO).
> >
> > This is a bit hacky in the back-end at the moment.  When I started
> > working with the back-end, address space 0 was already defined as
> > global, and I have not broken that convention yet.
> >
> > Then again, the issue is not really that big of a deal, since we need
> > to specially handle all "stack" accesses anyway.  It doesn't really
> > matter much what address space is used.
> 
> What kind of special handling would be required?  And how can you
> always tell whether or not an access through address space 0 would be a
> stack access?  For example, consider the attached .ll file, which
> compiles to a global store here.

[Villmow, Micah] If this was generated from OpenCL, then it is an invalid
program as the default address space is private to the thread and you cannot
have global variables in the private address space.

> Thanks,
> --
> Peter

Peter Collingbourne

2011-Oct-14 19:03 UTC

On Fri, Oct 14, 2011 at 06:32:37PM +0000, Villmow, Micah wrote:
> > What kind of special handling would be required?  And how can you
> > always tell whether or not an access through address space 0 would be
> > a stack access?  For example, consider the attached .ll file, which
> > compiles to a global store here.
> [Villmow, Micah] If this was generated from OpenCL, then it is an invalid
> program as the default address space is private to the thread and you
> cannot have global variables in the private address space.

Indeed, but it is (at present) valid LLVM IR for PTX.  The .ll file
illustrates the issue with having address space 0 map to the global
state space, as it does in the current PTX backend.

Thanks,
-- 
Peter
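
To make the point above concrete, here is a hedged sketch (not part of the
original attachment) of how the same test case might look if address space 0
were reserved for the stack and the global variable lived in a separate,
explicitly numbered address space.  The number 1 is an arbitrary choice for
illustration and is not taken from the PTX backend; the block labels are
likewise made up.  The syntax follows the LLVM IR of the period, as in the
attached file.

```llvm
; Assumption: address space 1 stands in for the PTX global state space.
target datalayout = "e-p:32:32-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx32--"

@g = addrspace(1) global i32 0, align 4

define ptx_kernel void @foo(i32 %pred) nounwind noinline {
entry:
  ; alloca yields a pointer in address space 0, i.e. the stack
  %p = alloca i32, align 4
  %tobool = icmp ne i32 %pred, 0
  br i1 %tobool, label %store.global, label %store.stack

store.global:
  ; the pointer type names the (hypothetical) global address space
  store i32 1, i32 addrspace(1)* @g, align 4
  br label %done

store.stack:
  ; plain i32* is address space 0, so this is known to be a stack store
  store i32 1, i32* %p, align 4
  br label %done

done:
  ret void
}
```

Because the address space is part of the pointer type, a select between
i32 addrspace(1)* @g and i32* %p has mismatched operand types and is not
valid IR, so each store here names its state space unambiguously.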

Justin Holewinski

2011-Oct-15 00:58 UTC

On Fri, Oct 14, 2011 at 9:55 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:

> On Thu, Oct 13, 2011 at 04:21:23PM -0400, Justin Holewinski wrote:
> > On Thu, Oct 13, 2011 at 4:16 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > > On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> > > > Justin,
> > > >  Out of these options, I would take the metadata approach for AA
> > > > support.
> > > >
> > > > This doesn't solve the problem of different frontend/backends
> > > > choosing different address space representations for the same
> > > > language, but is the correct approach for providing extra
> > > > information to the optimizations.
> > > >
> > > > The issue about memory spaces in general is a little different.
> > > > For example, based on the code you posted below, address space 0
> > > > (default) is global in CUDA, but in OpenCL, the default address
> > > > space is private. So, how does the ptx backend handle the
> > > > differences? I think this is problematic as address spaces are
> > > > language constructs and hardcoded at the frontend, but the backend
> > > > needs to be able to interpret them differently based on the source
> > > > language.
> > > >
> > > > One way this could be done is to have the backends have options,
> > > > but then each backend would need to implement this. I think a
> > > > better approach is to have some way to represent address spaces
> > > > generically in the module.
> > >
> > > Address space 0 (i.e. the default address space) should always be
> > > the address space on which the stack resides.  This is a requirement
> > > for alloca to work correctly.  So for PTX, I think that address
> > > space 0 should be the local state space (but I noticed that at the
> > > moment it is the global state space, which seems wrong IMHO).
> >
> > This is a bit hacky in the back-end at the moment.  When I started
> > working with the back-end, address space 0 was already defined as
> > global, and I have not broken that convention yet.
> >
> > Then again, the issue is not really that big of a deal, since we need
> > to specially handle all "stack" accesses anyway.  It doesn't really
> > matter much what address space is used.
>
> What kind of special handling would be required?  And how can you
> always tell whether or not an access through address space 0 would
> be a stack access?  For example, consider the attached .ll file,
> which compiles to a global store here.
>
Yes, this is currently an issue with the back-end.  The handling of stack
space is definitely a hack at the moment, but I have not had the time to
address it since it currently works in the typical use case.

>
> Thanks,
> --
> Peter
>


-- 

Thanks,

Justin Holewinski
