Peter Collingbourne
2011-Oct-13 20:16 UTC
[LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory Spaces
On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> Justin,
> Out of these options, I would take the metadata approach for AA support.
>
> This doesn't solve the problem of different frontend/backends choosing different
> address space representations for the same language, but is the correct
> approach for providing extra information to the optimizations.
>
> The issue about memory spaces in general is a little different. For example, based on
> the code you posted below, address space 0(default) is global in CUDA, but
> in OpenCL, the default address space is private. So, how does the ptx backend
> handle the differences? I think this is problematic as address spaces
> are language constructs and hardcoded at the frontend, but the backend needs to be
> able to interpret them differently based on the source language.
>
> One way this could be done is to have the backends have options, but then
> each backend would need to implement this. I think a better approach is
> to have some way to represent address spaces generically in the module.

Address space 0 (i.e. the default address space) should always be the
address space on which the stack resides. This is a requirement for
alloca to work correctly. So for PTX, I think that address space 0
should be the local state space (but I noticed that at the moment it
is the global state space, which seems wrong IMHO).

As I mentioned in my previous email, I don't think that the backend
should interpret address spaces for the source language, as this
places too much language-specific functionality in the backend.

The situation regarding default address spaces in CUDA is more
complex, but suffice it to say that there is usually no such thing
as a "default" address space in CUDA, because the language does not
contain support for address space qualified pointer types (only address
space qualified declarations). NVIDIA's CUDA compiler, nvopencc,
determines the correct address space for each pointer using type
inference (there is an explanation of nvopencc's algorithm in the
src/doc/ssa_memory_space.txt file in the nvopencc distribution).
Our compiler should eventually contain a similar algorithm.

Thanks,
--
Peter
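The position argued above (the frontend resolves language-level address spaces, and the backend only ever sees target address spaces) can be pictured with a minimal Python sketch. It is an illustration only, not code from Clang or the PTX backend. The target numbering and the names `TARGET_NUMBER`, `OPENCL_TO_PTX` and `lower_opencl_qualifier` are hypothetical; the only constraints taken from the thread are that address space 0 is the stack (PTX local) space and that OpenCL's default address space is private.

```python
from enum import Enum


class PtxSpace(Enum):
    """PTX state spaces relevant to this discussion."""
    LOCAL = "local"    # per-thread memory, where the stack lives
    GLOBAL = "global"
    SHARED = "shared"
    CONST = "const"


# Hypothetical target numbering (an assumption for illustration).
# The one constraint from the thread: 0 must be the stack space.
TARGET_NUMBER = {
    PtxSpace.LOCAL: 0,
    PtxSpace.GLOBAL: 1,
    PtxSpace.SHARED: 2,
    PtxSpace.CONST: 3,
}

# Frontend-side table: the language-specific knowledge stays here,
# so the backend never needs to know the source language.
OPENCL_TO_PTX = {
    "__private": PtxSpace.LOCAL,
    "__global": PtxSpace.GLOBAL,
    "__local": PtxSpace.SHARED,
    "__constant": PtxSpace.CONST,
}


def lower_opencl_qualifier(qualifier="__private"):
    """Return the target address space number for an OpenCL qualifier.

    An unqualified pointer defaults to __private, as in OpenCL.
    """
    return TARGET_NUMBER[OPENCL_TO_PTX[qualifier]]


if __name__ == "__main__":
    for q in OPENCL_TO_PTX:
        print(q, "->", lower_opencl_qualifier(q))
```

With a table like this per language, a different frontend could supply its own mapping onto the same target numbers without adding any backend options.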
Justin Holewinski
2011-Oct-13 20:21 UTC
[LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory Spaces
On Thu, Oct 13, 2011 at 4:16 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> > Justin,
> > Out of these options, I would take the metadata approach for AA support.
> >
> > This doesn't solve the problem of different frontend/backends choosing different
> > address space representations for the same language, but is the correct
> > approach for providing extra information to the optimizations.
> >
> > The issue about memory spaces in general is a little different. For example, based on
> > the code you posted below, address space 0(default) is global in CUDA, but
> > in OpenCL, the default address space is private. So, how does the ptx backend
> > handle the differences? I think this is problematic as address spaces
> > are language constructs and hardcoded at the frontend, but the backend needs to be
> > able to interpret them differently based on the source language.
> >
> > One way this could be done is to have the backends have options, but then
> > each backend would need to implement this. I think a better approach is
> > to have some way to represent address spaces generically in the module.
>
> Address space 0 (i.e. the default address space) should always be the
> address space on which the stack resides. This is a requirement for
> alloca to work correctly. So for PTX, I think that address space 0
> should be the local state space (but I noticed that at the moment it
> is the global state space, which seems wrong IMHO).

This is a bit hacky in the back-end at the moment. When I started working
with the back-end, address space 0 was already defined as global, and I have
not broken that convention yet.

Then again, the issue is not really that big of a deal, since we need to
specially handle all "stack" accesses anyway. It doesn't really matter much
what address space is used.

> As I mentioned in my previous email, I don't think that the backend
> should interpret address spaces for the source language, as this
> places too much language-specific functionality in the backend.
>
> The situation regarding default address spaces in CUDA is more
> complex, but suffice it to say that there is usually no such thing
> as a "default" address space in CUDA, because the language does not
> contain support for address space qualified pointer types (only address
> space qualified declarations). NVIDIA's CUDA compiler, nvopencc,
> determines the correct address space for each pointer using type
> inference (there is an explanation of nvopencc's algorithm in the
> src/doc/ssa_memory_space.txt file in the nvopencc distribution).
> Our compiler should eventually contain a similar algorithm.
>
> Thanks,
> --
> Peter

--
Thanks,
Justin Holewinski
Peter Collingbourne
2011-Oct-14 16:55 UTC
[LLVMdev] [cfe-dev] RFC: Representation of OpenCL Memory Spaces
On Thu, Oct 13, 2011 at 04:21:23PM -0400, Justin Holewinski wrote:
> On Thu, Oct 13, 2011 at 4:16 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> > On Thu, Oct 13, 2011 at 06:59:47PM +0000, Villmow, Micah wrote:
> > > Justin,
> > > Out of these options, I would take the metadata approach for AA support.
> > >
> > > This doesn't solve the problem of different frontend/backends choosing different
> > > address space representations for the same language, but is the correct
> > > approach for providing extra information to the optimizations.
> > >
> > > The issue about memory spaces in general is a little different. For example, based on
> > > the code you posted below, address space 0(default) is global in CUDA, but
> > > in OpenCL, the default address space is private. So, how does the ptx backend
> > > handle the differences? I think this is problematic as address spaces
> > > are language constructs and hardcoded at the frontend, but the backend needs to be
> > > able to interpret them differently based on the source language.
> > >
> > > One way this could be done is to have the backends have options, but then
> > > each backend would need to implement this. I think a better approach is
> > > to have some way to represent address spaces generically in the module.
> >
> > Address space 0 (i.e. the default address space) should always be the
> > address space on which the stack resides. This is a requirement for
> > alloca to work correctly. So for PTX, I think that address space 0
> > should be the local state space (but I noticed that at the moment it
> > is the global state space, which seems wrong IMHO).
>
> This is a bit hacky in the back-end at the moment. When I started working
> with the back-end, address space 0 was already defined as global, and I have
> not broken that convention yet.
>
> Then again, the issue is not really that big of a deal, since we need to
> specially handle all "stack" accesses anyway. It doesn't really matter much
> what address space is used.

What kind of special handling would be required? And how can you
always tell whether or not an access through address space 0 would
be a stack access? For example, consider the attached .ll file,
which compiles to a global store here.

Thanks,
--
Peter
-------------- next part --------------
target datalayout = "e-p:32:32-i64:64:64-f64:64:64-n1:8:16:32:64"
target triple = "ptx32--"

@g = common global i32 0, align 4

define ptx_kernel void @foo(i32 %pred) nounwind noinline {
entry:
  %p = alloca i32, align 4
  %tobool = icmp ne i32 %pred, 0
  %g.p = select i1 %tobool, i32* @g, i32* %p
  store i32 1, i32* %g.p, align 4, !tbaa !1
  ret void
}

!0 = metadata !{metadata !"int", metadata !1}
!1 = metadata !{metadata !"omnipotent char", metadata !2}
!2 = metadata !{metadata !"Simple C/C++ TBAA", null}
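The question raised by the attached .ll file can be made concrete with a small Python model of the function @foo. This is a deliberately simplified forward pass over a hand-written instruction list. It is not nvopencc's algorithm and not anything the PTX backend does; the tuple encoding and the names `join`, `infer_spaces` and `FOO` are hypothetical. It shows only that the pointer feeding the store may refer to either a global or a stack slot, so no single state space can be picked for it statically.

```python
LOCAL, GLOBAL, GENERIC = "local", "global", "generic"


def join(a, b):
    """Merge the spaces of two pointers that may flow into one value."""
    return a if a == b else GENERIC


def infer_spaces(instructions):
    """Assign a state space to each pointer value.

    instructions is a list of (result, opcode, operands) tuples in
    definition order; this encoding is an assumption for illustration.
    """
    spaces = {}
    for result, opcode, operands in instructions:
        if opcode == "alloca":
            spaces[result] = LOCAL       # stack slot
        elif opcode == "global":
            spaces[result] = GLOBAL      # module-level variable
        elif opcode == "select":
            a, b = operands              # the two pointer operands
            spaces[result] = join(spaces[a], spaces[b])
    return spaces


# Pointer-producing values of @foo from the attached .ll file.
FOO = [
    ("@g", "global", ()),
    ("%p", "alloca", ()),
    ("%g.p", "select", ("@g", "%p")),
]

if __name__ == "__main__":
    print(infer_spaces(FOO)["%g.p"])  # prints "generic"
```

Under this model, a store through %g.p is correct as a global store only when %pred is non-zero, which is the difficulty the example is meant to expose.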