thr3ads.net - llvm dev - [LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support

Peter Collingbourne

2011-Feb-28 21:41 UTC

[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)

On Fri, Feb 25, 2011 at 02:55:33PM -0500, Ken Dyck wrote:
> The address space mechanism is used by some code generators to
> differentiate between physical memory spaces. The PIC16 code generator
> uses address spaces 0 and 1 to select between its RAM and ROM spaces.
> And X86 uses address space 256 for GS and 257 for FS. In the back end
> for a dual-harvard DSP that I've been working on, I use address spaces
> 0-3 to designate the various memories on the machine.
> 
> The enum conflicts are easy enough to fix, but this current
> implementation doesn't seem to leave room to specify both language-
> and target-specific options on the same pointer. For example, when
> developing an app for a PIC16, how would a user specify a pointer to a
> CONSTANT variable in the ROM space?
> 
> Perhaps we could reserve separate bitfields within the address space
> number for language- and target-specific options. The OpenCL code
> would then need to shift and OR its constants with any address space
> numbers specified with the __attribute__ syntax.
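
The bitfield idea in the quoted message can be sketched as follows. This is only an illustration under assumed names and an assumed layout (low 16 bits for the target, the bits above for the language); neither LLVM nor Clang defines such a split, and the `Lang*` constants are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (not defined by LLVM): the low 16 bits hold the
// target-specific address space, the bits above hold the language-specific one.
const unsigned kTargetBits = 16;
const std::uint32_t kTargetMask = (1u << kTargetBits) - 1;

// Hypothetical language-level constants for OpenCL.
enum LangSpace : std::uint32_t {
  LangNone = 0,
  LangGlobal = 1,
  LangLocal = 2,
  LangConstant = 3
};

// Shift the language part up and OR it with the target part.
inline std::uint32_t packAddrSpace(LangSpace lang, std::uint32_t target) {
  assert(target <= kTargetMask && "target address space does not fit");
  return (static_cast<std::uint32_t>(lang) << kTargetBits) | target;
}

// Recover the target-specific part of a packed number.
inline std::uint32_t targetPart(std::uint32_t packed) {
  return packed & kTargetMask;
}

// Recover the language-specific part of a packed number.
inline LangSpace langPart(std::uint32_t packed) {
  return static_cast<LangSpace>(packed >> kTargetBits);
}
```

Under this layout, a constant pointer into a ROM space numbered 1 would be `packAddrSpace(LangConstant, 1)`, while existing target-only numbers such as 256 and 257 keep their values when combined with `LangNone`.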
The more I think about it, the more I become uncomfortable with the
concept of language-specific address spaces in LLVM.  These are the
main issues I see with language-specific address spaces:

Firstly, it forces every target to 'know' about each source language,
requiring (potentially) modification of each target for each new
frontend language with multiple targets.  This goes against the LLVM
design principle of language independence, and encourages frontends
to reuse (abuse?) address spaces which are meant for other languages.

Secondly, consider the issue of language interoperability (e.g. a
hypothetical CUDA <-> OpenCL interop layer) -- we either lose the
ability to pass pointers between languages in a type-safe way or end
up giving awkward names to address spaces.

Instead of language-specific address spaces, each target should
concentrate on exposing all of its address spaces as target-specific
address spaces, and frontends should use a language -> target mapping
in target-specific code.  We can continue to expose the target's main
shared writable address space as address space 0 as we do now.

For example, Clang could define a set of internal address space
constants for OpenCL and use TargetCodeGenInfo to provide the mapping
to target address spaces.

An additional benefit is that this solution would allow AMD and
other backends with non-standard orderings [1] to retain backward
compatibility.

In Clang, by default, pointers would be in language address space 0,
which could map to any target address space (normally 0).  This neatly
resolves the "default address space" problem for devices with a
nonzero private address space (although on the LLVM side we would
need an address-space-aware alloca).
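
A minimal sketch of the language -> target mapping described above, assuming a simple per-target lookup table. The enum, the struct, and both example targets are hypothetical; this is not Clang's TargetCodeGenInfo interface, only an illustration of the shape such a mapping could take.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical language-level address spaces for OpenCL. Language address
// space 0 is the default for pointers with no qualifier.
enum class OpenCLAddrSpace : std::size_t {
  Private = 0,
  Global,
  Local,
  Constant,
  Count
};

// A per-target table from language address space to target address space.
struct TargetAddrSpaceMap {
  std::array<unsigned, static_cast<std::size_t>(OpenCLAddrSpace::Count)> table;

  unsigned toTarget(OpenCLAddrSpace as) const {
    const std::size_t index = static_cast<std::size_t>(as);
    assert(index < table.size() && "unknown language address space");
    return table[index];
  }
};

// Made-up target with a single flat memory: everything maps to 0.
const TargetAddrSpaceMap FlatTarget = {{{0, 0, 0, 0}}};

// Made-up GPU-like target whose private memory is not address space 0.
const TargetAddrSpaceMap GpuLikeTarget = {{{5, 1, 3, 4}}};
```

With a table like this, the frontend always starts from language address space 0 for unqualified pointers, and only the target-specific table decides whether that becomes target address space 0 or something else.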

Thanks,
-- 
Peter

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-February/038199.html

David Neto

2011-Mar-01 21:06 UTC

[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)

On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>
> The more I think about it, the more I become uncomfortable with the
> concept of language-specific address spaces in LLVM.  These are the
> main issues I see with language-specific address spaces:
...
> Instead of language-specific address spaces, each target should
> concentrate on exposing all of its address spaces as target-specific
> address spaces, and frontends should use a language -> target mapping
> in target-specific code.  We can continue to expose the target's main
> shared writable address space as address space 0 as we do now.
>
> For example, Clang could define a set of internal address space
> constants for OpenCL and use TargetCodeGenInfo to provide the mapping
> to target address spaces.

In principle this is a fine idea.

I think the difficulty is that LLVM and Clang provide an
infrastructure for numbered address spaces, but no standard assignment
on top of that infrastructure.  The trick is to define some conventions,
e.g. what the numbers might mean for a language front-end, and whether
the interpretation of the numbers changes as the IR moves to later
stages.  We're working in a bit of a vacuum.

For example, you're proposing a remapping step somewhere along the
line: that could be entirely inside a back-end code generator.  Or it
could conceivably be an LLVM pass itself, which then could be used
with multiple backends that understand the new convention.

So I think we need a couple of things:
- proposals for number assignments and their associated semantics.
- code to flesh out and embody those semantics. e.g. a sample
implementation / translation layer

Basically Anton got the ball rolling: his code patch was a bit of
both.  And I think he's planning to post a number of OpenCL proposals
in general.

As it is, I hope that backends that do not understand address spaces
at all know to error out when they receive IR that uses address
spaces.

david

Speziale Ettore

2011-Mar-02 07:12 UTC

[LLVMdev] [cfe-dev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)

Hi,
> On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > The more I think about it, the more I become uncomfortable with the
> > concept of language-specific address spaces in LLVM.  These are the
> > main issues I see with language-specific address spaces:
> 
> ...
> 
> > Instead of language-specific address spaces, each target should
> > concentrate on exposing all of its address spaces as target-specific
> > address spaces, and frontends should use a language -> target mapping
> > in target-specific code.  We can continue to expose the target's main
> > shared writable address space as address space 0 as we do now.
> >
> > For example, Clang could define a set of internal address space
> > constants for OpenCL and use TargetCodeGenInfo to provide the mapping
> > to target address spaces.
> 
> In principle this is a fine idea.
> 
> I think the difficulty is that LLVM and Clang provide an
> infrastructure for numbered address spaces, but no standard assignment
> on top of that infrastructure.  The trick is to define some conventions,
> e.g. what the numbers might mean for a language front-end, and whether
> the interpretation of the numbers changes as the IR moves to later
> stages.  We're working in a bit of a vacuum.
> 
> For example, you're proposing a remapping step somewhere along the
> line: that could be entirely inside a back-end code generator.  Or it
> could conceivably be an LLVM pass itself, which then could be used
> with multiple backends that understand the new convention.
> 
> So I think we need a couple of things:
> - proposals for number assignments and their associated semantics.
> - code to flesh out and embody those semantics. e.g. a sample
> implementation / translation layer
> 
> Basically Anton got the ball rolling: his code patch was a bit of
> both.  And I think he's planning to post a number of OpenCL proposals
> in general.
> 
> As it is, I hope that backends that do not understand address spaces
> at all know to error out when they receive IR that uses address
> spaces.

The OpenCL standard talks about address spaces, but I think they can be
interpreted as scopes (except __constant):

* __global: globally accessible variables
* __private: visible only to a work item
* __local: accessible by all work items in a work group

The address space is the way scoping rules are implemented in hardware,
e.g. __local variables are mapped to address space X, which is a fast
memory shared by all ALUs inside a GPU multiprocessor. Maybe by introducing
such "scopes", it is possible to decouple backends from frontends.

__constant is a corner case: it can be modelled as a global scope that
contains read-only data.
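
The scope interpretation above could be sketched like this, assuming a scope is a visibility level plus a read-only flag. All type and function names here are hypothetical and are not taken from Clang or the OpenCL headers.

```cpp
// Who can see a variable, following the three scopes listed above.
enum class Visibility { WorkItem, WorkGroup, Global };

// A scope in the sense used above: a visibility level plus a read-only flag,
// which is what lets __constant be expressed as a read-only global scope.
struct Scope {
  Visibility visibility;
  bool readOnly;
};

// Hypothetical names for the four OpenCL qualifiers.
enum class OpenCLQualifier { Global, Private, Local, Constant };

inline Scope scopeOf(OpenCLQualifier q) {
  switch (q) {
  case OpenCLQualifier::Private:
    return {Visibility::WorkItem, false};
  case OpenCLQualifier::Local:
    return {Visibility::WorkGroup, false};
  case OpenCLQualifier::Global:
    return {Visibility::Global, false};
  case OpenCLQualifier::Constant:
    return {Visibility::Global, true};
  }
  return {Visibility::Global, false}; // not reached for valid enumerators
}
```

A backend would then only need to decide which of its memories implements each (visibility, read-only) pair, without knowing which source language produced it.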

Have a nice day,
speziale.ettore at gmail.com

Ken Dyck

2011-Mar-02 14:38 UTC

[LLVMdev] Language-specific vs target-specific address spaces (was Re: [PATCH] OpenCL support - update on keywords)

On Tue, Mar 1, 2011 at 4:06 PM, David Neto wrote:
> On Mon, Feb 28, 2011 at 4:41 PM, Peter Collingbourne wrote:
>>
>> The more I think about it, the more I become uncomfortable with the
>> concept of language-specific address spaces in LLVM.  These are the
>> main issues I see with language-specific address spaces:
>
> ...
>
>> Instead of language-specific address spaces, each target should
>> concentrate on exposing all of its address spaces as target-specific
>> address spaces, and frontends should use a language -> target mapping
>> in target-specific code.  We can continue to expose the target's main
>> shared writable address space as address space 0 as we do now.
>>
>> For example, Clang could define a set of internal address space
>> constants for OpenCL and use TargetCodeGenInfo to provide the mapping
>> to target address spaces.
>
> In principle this is a fine idea.
>
> I think the difficulty is that LLVM and Clang provide an
> infrastructure for numbered address spaces, but no standard assignment
> on top of that infrastructure.

You can trace back the origins of the addrspace attribute in the
mailing list archives to this thread:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2007-November/011385.html.
From there, it is pretty clear that addrspace was introduced
specifically as a mechanism for implementing the 'named address space'
extensions defined in the Embedded C standard (ISO/IEC TR 18037,
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1169.pdf).

The Embedded C standard gives this overview of the 'named address
space' extension:

  Many embedded processors have multiple distinct banks of memory
  and require that data be grouped in different banks to achieve
  maximum performance.  Ensuring the simultaneous flow of data and
  coefficient data to the multiplier/accumulator of processors
  designed for FIR filtering, for example, is critical to their
  operation.  In order to allow the programmer to declare the memory
  space from which a specific data object must be fetched, this
  Technical Report specifies basic support for multiple address
  spaces.  As a result, optimizing compilers can utilize the ability
  of processors that support multiple address spaces, for instance,
  to read data from two separate memories in a single cycle to
  maximize execution speed.

If you dig into the Embedded C standard, you'll find that the 'named
address space' extension is highly target-specific. It is only
portable insofar as two target processors have similar memory
organization and use identical names for their address spaces.

So the reason that there aren't any conventions for the address space
numbers in clang/llvm is because there aren't any conventions for how
chip designers incorporate memories into the architectures that they
design.

The one convention that the Embedded C standard does specify is that
when the address space of a type is unspecified, the type is assumed
to be in the 'generic' space. Clang currently emits an address space
of zero in this case. Arguably, LLVM could define a single enum value,
GENERIC, for use by the code generators.

> The trick is to define some conventions,
> e.g. what the numbers might mean for a language front-end, and whether
> the interpretation of the numbers changes as the IR moves to later
> stages.  We're working in a bit of a vacuum.
>
> ...
>
> So I think we need a couple of things:
> - proposals for number assignments and their associated semantics.
> - code to flesh out and embody those semantics. e.g. a sample
> implementation / translation layer

In my opinion, any knowledge that front ends have of address spaces
should be dictated by the target's back end. Perhaps we should add
some virtual methods to LLVM's TargetMachine interface so front ends
can query the back end for the names and numbers of the address spaces
that they recognize, and expose them to end users in a standard way.
But having front ends impose the requirement on back ends that they
recognize some arbitrary set of language-specific address spaces seems
like a great misuse of the feature to me for reasons that Peter has
already pointed out.
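
A rough sketch of the kind of query interface suggested above. This is a standalone, hypothetical class, not an existing part of LLVM's TargetMachine, and the example target's names and numbers ("ram" = 0, "rom" = 1) are assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// One address space as a back end might describe it to a front end.
struct AddrSpaceInfo {
  unsigned number;
  std::string name;
};

// Hypothetical interface: the back end owns the list of address spaces,
// and front ends only query it.
class AddrSpaceQuery {
public:
  virtual ~AddrSpaceQuery() = default;

  // All address spaces the target recognizes.
  virtual std::vector<AddrSpaceInfo> addressSpaces() const = 0;

  // Look up a space by name; returns false if the target has no such space.
  bool lookup(const std::string &name, unsigned &number) const {
    for (const AddrSpaceInfo &info : addressSpaces()) {
      if (info.name == name) {
        number = info.number;
        return true;
      }
    }
    return false;
  }
};

// Made-up target with separate RAM and ROM spaces.
class Pic16LikeTarget : public AddrSpaceQuery {
public:
  std::vector<AddrSpaceInfo> addressSpaces() const override {
    return {{0, "ram"}, {1, "rom"}};
  }
};
```

A front end could call `lookup("rom", n)` to resolve a user-visible name to a number, and report an error to the user when the target does not recognize the name.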
> Basically Anton got the ball rolling: his code patch was a bit of
> both.  And I think he's planning to post a number of OpenCL proposals
> in general.

It seems to me, as Speziale already pointed out, that the OpenCL type
qualifiers aren't address space qualifiers at all (in the Embedded C
sense). They might be better implemented as a separate set of
qualifiers in the way that Objective-C defines its garbage-collection
qualifiers, __strong and __weak. See the Qualifiers class in
AST/Type.h.

> As it is, I hope that backends that do not understand address spaces
> at all know to error out when they receive IR that uses address
> spaces.

This is currently not the case. The back ends for architectures that
don't have multiple address spaces simply ignore the address space
number on the address operands of load and store nodes. The back ends
that do support multiple address spaces treat any address space number
that they don't recognize in the same way that they treat address space 0.
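
The difference between the behaviour described here and the "error out" behaviour hoped for earlier in the thread can be sketched as two policies. The function and enum names are hypothetical; this does not reproduce any actual back end's code.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>

// How a back end reacts to an address space number it does not recognize.
enum class UnknownPolicy {
  TreatAsDefault, // fall back to address space 0 (the behaviour described above)
  Reject          // report an error instead of silently falling back
};

inline unsigned resolveAddrSpace(unsigned requested,
                                 const std::set<unsigned> &known,
                                 UnknownPolicy policy) {
  assert(known.count(0) && "address space 0 must always be supported");
  if (known.count(requested))
    return requested;
  if (policy == UnknownPolicy::Reject)
    throw std::invalid_argument("unsupported address space");
  return 0;
}
```

With `TreatAsDefault`, a pointer tagged with an unknown number quietly becomes an ordinary address space 0 access; with `Reject`, the same input is surfaced as an error.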

-Ken
