Peter Collingbourne
2011-Feb-28 21:41 UTC
[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)
On Fri, Feb 25, 2011 at 02:55:33PM -0500, Ken Dyck wrote:
> The address space mechanism is used by some code generators to
> differentiate between physical memory spaces. The PIC16 code generator
> uses address spaces 0 and 1 to select between its RAM and ROM spaces.
> And X86 uses address space 256 for GS and 257 for FS. In the back end
> for a dual-harvard DSP that I've been working on, I use address spaces
> 0-3 to designate the various memories on the machine.
>
> The enum conflicts are easy enough to fix, but this current
> implementation doesn't seem to leave room to specify both language-
> and target-specific options on the same pointer. For example, when
> developing an app for a PIC16, how would a user specify a pointer to a
> CONSTANT variable in the ROM space?
>
> Perhaps we could reserve separate bitfields within the address space
> number for language- and target-specific options. The OpenCL code
> would then need to shift and OR its constants with any address space
> numbers specified with the __attribute__ syntax.

The more I think about it, the more I become uncomfortable with the concept of language-specific address spaces in LLVM. These are the main issues I see with language-specific address spaces:

Firstly, it forces every target to 'know' about each source language, requiring (potentially) modification of each target for each new frontend language with multiple targets. This goes against the LLVM design principle of language independence, and encourages frontends to reuse (abuse?) address spaces which are meant for other languages.

Secondly, consider the issue of language interoperability (e.g. a hypothetical CUDA <-> OpenCL interop layer) -- we either lose the ability to pass pointers between languages in a type-safe way or end up giving awkward names to address spaces.
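For reference, the bitfield scheme suggested in the quoted text could look roughly like the sketch below. It is only an illustration: the 16-bit split, the `LangSpace` values, and the helper names are assumptions made here, not anything defined by LLVM or Clang. The target field is made 16 bits wide so that the X86 numbers 256 and 257 mentioned above would still fit.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (hypothetical): the low 16 bits carry the target-specific
// address space, the bits above them carry a language-specific tag.
constexpr unsigned kTargetBits = 16;
constexpr std::uint32_t kTargetMask = (1u << kTargetBits) - 1;

// Hypothetical language-level tags; the values are illustrative only.
enum LangSpace : std::uint32_t {
  LangDefault = 0,
  LangGlobal = 1,
  LangLocal = 2,
  LangConstant = 3
};

// "Shift and OR" the language tag with the target address space number.
constexpr std::uint32_t packAddrSpace(std::uint32_t lang, std::uint32_t target) {
  return (lang << kTargetBits) | (target & kTargetMask);
}

// Recover the two fields from a packed number.
constexpr std::uint32_t langPart(std::uint32_t packed) {
  return packed >> kTargetBits;
}
constexpr std::uint32_t targetPart(std::uint32_t packed) {
  return packed & kTargetMask;
}
```

Under these assumptions, a CONSTANT pointer into a ROM space numbered 1 would be `packAddrSpace(LangConstant, 1)`. Every back end would still have to mask off the language bits, which is part of the coupling objected to above.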
Instead of language-specific address spaces, each target should concentrate on exposing all of its address spaces as target-specific address spaces, and frontends should use a language -> target mapping in target-specific code. We can continue to expose the target's main shared writable address space as address space 0 as we do now.

For example, Clang could define a set of internal address space constants for OpenCL and use TargetCodeGenInfo to provide the mapping to target address spaces. An additional benefit is that this solution would allow AMD and other backends with non-standard orderings [1] to retain backward compatibility.

In Clang, by default, pointers would be in language address space 0, which could map to any target address space (normally 0). This neatly resolves the "default address space" problem for devices with a nonzero private address space (although on the LLVM side we would need an address-space-aware alloca).

Thanks,
--
Peter

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-February/038199.html
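A minimal sketch of the language -> target mapping described above, written as standalone C++ rather than against the real Clang classes. `OpenCLSpace`, `TargetAddrSpaceMap`, and both factory functions are hypothetical names, and the numbers in the GPU-like table are invented for illustration; the message only says that TargetCodeGenInfo could provide such a mapping.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical front-end constants. Language space 0 is the default,
// which is the private space in OpenCL.
enum class OpenCLSpace : std::size_t { Private = 0, Global, Local, Constant };

using Table = std::array<unsigned, 4>;

// Hypothetical stand-in for a per-target hook that maps language
// address spaces to target address space numbers.
class TargetAddrSpaceMap {
public:
  explicit TargetAddrSpaceMap(const Table &table) : table_(table) {}

  unsigned lookup(OpenCLSpace space) const {
    const std::size_t index = static_cast<std::size_t>(space);
    assert(index < table_.size() && "not an OpenCL address space");
    return table_[index];
  }

private:
  Table table_;
};

// A target with a single memory: every language space maps to 0.
inline TargetAddrSpaceMap makeFlatTargetMap() {
  const Table table = {{0, 0, 0, 0}};
  return TargetAddrSpaceMap(table);
}

// An invented GPU-like target: the shared writable (global) memory stays
// at 0, while the private space is nonzero.
inline TargetAddrSpaceMap makeHypotheticalGpuMap() {
  const Table table = {{1, 0, 2, 3}};
  return TargetAddrSpaceMap(table);
}
```

With a table like this the front end never emits language numbers into the IR. It asks the target's map for the number to use, so a target with an unusual ordering only changes its own table.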
David Neto
2011-Mar-01 21:06 UTC
[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)
On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> The more I think about it, the more I become uncomfortable with the
> concept of language-specific address spaces in LLVM. These are the
> main issues I see with language-specific address spaces:

...

> Instead of language-specific address spaces, each target should
> concentrate on exposing all of its address spaces as target-specific
> address spaces, and frontends should use a language -> target mapping
> in target-specific code. We can continue to expose the target's main
> shared writable address space as address space 0 as we do now.
>
> For example, Clang could define a set of internal address space
> constants for OpenCL and use TargetCodeGenInfo to provide the mapping
> to target address spaces.

In principle this is a fine idea.

I think the difficulty is that LLVM and Clang provide an infrastructure for numbered address spaces, but no standard assignment on top of that infrastructure. The trick is to define some conventions, e.g. what the numbers might mean for a language front-end, and whether the interpretation of the numbers changes as the IR moves to later stages. We're working in a bit of a vacuum.

For example, you're proposing a remapping step somewhere along the line: that could be entirely inside a back-end code generator. Or it could conceivably be an LLVM pass itself, which then could be used with multiple backends that understand the new convention.

So I think we need a couple of things:
- proposals for number assignments and their associated semantics.
- code to flesh out and embody those semantics, e.g. a sample implementation / translation layer

Basically Anton got the ball rolling: his code patch was a bit of both. And I think he's planning to post a number of OpenCL proposals in general.

As it is, I hope that backends that do not understand address spaces at all know to error out when they receive IR that uses address spaces.

david
Speziale Ettore
2011-Mar-02 07:12 UTC
[LLVMdev] [cfe-dev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)
Hi,

> On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > The more I think about it, the more I become uncomfortable with the
> > concept of language-specific address spaces in LLVM. These are the
> > main issues I see with language-specific address spaces:
>
> ...
>
> > Instead of language-specific address spaces, each target should
> > concentrate on exposing all of its address spaces as target-specific
> > address spaces, and frontends should use a language -> target mapping
> > in target-specific code. We can continue to expose the target's main
> > shared writable address space as address space 0 as we do now.
> >
> > For example, Clang could define a set of internal address space
> > constants for OpenCL and use TargetCodeGenInfo to provide the mapping
> > to target address spaces.
>
> In principle this is a fine idea.
>
> I think the difficulty is that LLVM and Clang provide an
> infrastructure for numbered address spaces, but no standard assignment
> on top of that infrastructure. The trick is to define some conventions,
> e.g. what the numbers might mean for a language front-end, and whether
> the interpretation of the numbers changes as the IR moves to later
> stages. We're working in a bit of a vacuum.
>
> For example, you're proposing a remapping step somewhere along the
> line: that could be entirely inside a back-end code generator. Or it
> could conceivably be an LLVM pass itself, which then could be used
> with multiple backends that understand the new convention.
>
> So I think we need a couple of things:
> - proposals for number assignments and their associated semantics.
> - code to flesh out and embody those semantics. e.g. a sample
> implementation / translation layer
>
> Basically Anton got the ball rolling: his code patch was a bit of
> both. And I think he's planning to post a number of OpenCL proposals
> in general.
>
> As it is, I hope that backends that do not understand address spaces
> at all know to error out when they receive IR that uses address
> spaces.

The OpenCL standard talks about address spaces, but I think they can be interpreted as scopes (except __constant):

* __global: globally accessible variables
* __private: visible only to a work item
* __local: accessible by all work items in a work group

The address space is the way scoping rules are implemented in hardware, e.g. __local variables are mapped to an address space X, which is a fast memory shared by all ALUs inside a GPU multiprocessor. Maybe by introducing such "scopes" it is possible to decouple backends from frontends.

__constant is a corner case: it can be modelled as a global scope that contains read-only data.

Have a nice day,
speziale.ettore at gmail.com
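The "scope" reading of the OpenCL qualifiers could be sketched as below. This is only one possible encoding: `Visibility`, `MemoryScope`, `OpenCLQualifier`, and `scopeFor` are names invented for this sketch, and none of them exist in Clang or LLVM. The point it illustrates is that __constant needs no scope of its own once read-only is tracked as a separate flag.

```cpp
#include <cassert>

// Who can see a variable, independent of how the hardware stores it.
enum class Visibility { WorkItem, WorkGroup, Device };

// Hypothetical scope descriptor: visibility plus a read-only flag.
struct MemoryScope {
  Visibility visibility;
  bool readOnly;
};

// The four OpenCL qualifiers discussed in the message.
enum class OpenCLQualifier { Private, Local, Global, Constant };

// Map each qualifier to a scope. __constant is modelled as a
// device-wide (global) scope whose data is read-only.
inline MemoryScope scopeFor(OpenCLQualifier qualifier) {
  switch (qualifier) {
  case OpenCLQualifier::Private:
    return {Visibility::WorkItem, false};
  case OpenCLQualifier::Local:
    return {Visibility::WorkGroup, false};
  case OpenCLQualifier::Global:
    return {Visibility::Device, false};
  case OpenCLQualifier::Constant:
    return {Visibility::Device, true};
  }
  assert(false && "unhandled OpenCL qualifier");
  return {Visibility::Device, false};
}
```

A back end would then choose a physical memory per scope, for example a fast on-chip memory for `Visibility::WorkGroup`, without having to know which source language produced the scope.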
Ken Dyck
2011-Mar-02 14:38 UTC
[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)
On Tue, Mar 1, 2011 at 4:06 PM, David Neto wrote:
> On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne wrote:
>>
>> The more I think about it, the more I become uncomfortable with the
>> concept of language-specific address spaces in LLVM. These are the
>> main issues I see with language-specific address spaces:
>
> ...
>
>> Instead of language-specific address spaces, each target should
>> concentrate on exposing all of its address spaces as target-specific
>> address spaces, and frontends should use a language -> target mapping
>> in target-specific code. We can continue to expose the target's main
>> shared writable address space as address space 0 as we do now.
>>
>> For example, Clang could define a set of internal address space
>> constants for OpenCL and use TargetCodeGenInfo to provide the mapping
>> to target address spaces.
>
> In principle this is a fine idea.
>
> I think the difficulty is that LLVM and Clang provide an
> infrastructure for numbered address spaces, but no standard assignment
> on top of that infrastructure.

You can trace back the origins of the addrspace attribute in the mailing list archives to this thread: http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-November/011385.html. From there, it is pretty clear that addrspace was introduced specifically as a mechanism for implementing the 'named address space' extensions defined in the Embedded C standard (ISO/IEC TR 18037, http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf).

The Embedded C standard gives this overview of the 'named address space' extension:

    Many embedded processors have multiple distinct banks of memory and require that data be grouped in different banks to achieve maximum performance. Ensuring the simultaneous flow of data and coefficient data to the multiplier/accumulator of processors designed for FIR filtering, for example, is critical to their operation.

    In order to allow the programmer to declare the memory space from which a specific data object must be fetched, this Technical Report specifies basic support for multiple address spaces. As a result, optimizing compilers can utilize the ability of processors that support multiple address spaces, for instance, to read data from two separate memories in a single cycle to maximize execution speed.

If you dig into the Embedded C standard, you'll find that the 'named address space' extension is highly target-specific. It is only portable insofar as two target processors have similar memory organization and use identical names for their address spaces. So the reason that there aren't any conventions for the address space numbers in clang/llvm is that there aren't any conventions for how chip designers incorporate memories into the architectures that they design.

The one convention that the Embedded C standard does specify is that when the address space of a type is unspecified, the type is assumed to be in the 'generic' space. Clang currently emits an address space of zero in this case. Arguably, LLVM could define a single enum value, GENERIC, for use by the code generators.

> The trick is to define some conventions,
> e.g. what the numbers might mean for a language front-end, and whether
> the interpretation of the numbers changes as the IR moves to later
> stages. We're working in a bit of a vacuum.
>
> ...
>
> So I think we need a couple of things:
> - proposals for number assignments and their associated semantics.
> - code to flesh out and embody those semantics. e.g. a sample
> implementation / translation layer

In my opinion, any knowledge that front ends have of address spaces should be dictated by the target's back end. Perhaps we should add some virtual methods to LLVM's TargetMachine interface so front ends can query the back end for the names and numbers of the address spaces that they recognize, and expose them to end users in a standard way.

But having front ends impose the requirement on back ends that they recognize some arbitrary set of language-specific address spaces seems like a great misuse of the feature to me, for reasons that Peter has already pointed out.

> Basically Anton got the ball rolling: his code patch was a bit of
> both. And I think he's planning to post a number of OpenCL proposals
> in general.

It seems to me, as Speziale already pointed out, that the OpenCL type qualifiers aren't address space qualifiers at all (in the Embedded C sense). They might be better implemented as a separate set of qualifiers in the way that Objective-C defines its garbage-collection qualifiers, __strong and __weak. See the Qualifiers class in AST/Type.h.

> As it is, I hope that backends that do not understand address spaces
> at all know to error out when they receive IR that uses address
> spaces.

This is currently not the case. The back ends for architectures that don't have multiple address spaces simply ignore the address space number on the address operands of load and store nodes. The back ends that do support multiple address spaces treat any address space number that they don't recognize in the same way that they treat address space 0.

-Ken
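The query interface suggested in this message could look something like the following sketch. It does not use or extend the real LLVM TargetMachine class: `TargetAddrSpaceQuery`, `AddrSpaceDesc`, `Pic16LikeTarget`, and `lookupAddrSpace` are all hypothetical, as are the names "ram" and "rom". The numbers 0 and 1 follow the PIC16 description earlier in the thread, assuming RAM is 0 and ROM is 1.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One address space as a back end might describe it to a front end.
struct AddrSpaceDesc {
  std::string name;
  unsigned number;
};

// Hypothetical query interface a back end would implement.
class TargetAddrSpaceQuery {
public:
  virtual ~TargetAddrSpaceQuery() = default;
  virtual std::vector<AddrSpaceDesc> addressSpaces() const = 0;
};

// Illustrative back end with separate RAM and ROM spaces.
class Pic16LikeTarget final : public TargetAddrSpaceQuery {
public:
  std::vector<AddrSpaceDesc> addressSpaces() const override {
    return {{"ram", 0}, {"rom", 1}};
  }
};

// Front-end side: resolve a user-visible name to the target's number.
// Returns false for names the target does not recognize, so the front
// end can diagnose them instead of silently falling back to space 0.
inline bool lookupAddrSpace(const TargetAddrSpaceQuery &target,
                            const std::string &name, unsigned &number) {
  assert(!name.empty() && "address space name must not be empty");
  for (const AddrSpaceDesc &desc : target.addressSpaces()) {
    if (desc.name == name) {
      number = desc.number;
      return true;
    }
  }
  return false;
}
```

Reporting unknown names as an error in the front end is a design choice of this sketch. It differs from the back-end behaviour described above, where an unrecognized number is handled like address space 0.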