thr3ads.net - llvm dev - [LLVMdev] Address space extension [Aug 2013]


Michele Scandale

2013-Aug-07 20:52 UTC

[LLVMdev] Address space extension

Hello everybody,

I would like to start a discussion about a possible extension of the address 
space concept in LLVM.

The idea grew out of this discussion on the clang mailing 
list (first msg: 
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130715/084011.html 
- interesting point: 
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130722/084499.html),
which made the point that "source language level" information about address
spaces can be useful for optimizations in the middle end.

IMHO this information should be a plus that could be *safely* ignored 
when not necessary and used where it can provide an improvement in 
optimizations. This does not necessarily mean that the middle-end (and the 
back-ends) must be aware of the semantics of these logical address 
spaces; it would be enough just to distinguish between two logically 
different address spaces.
The first application I see is alias analysis: for targets that do not 
have different physical address spaces (e.g. X86), meaning that in the 
IR the 'addrspace' modifier *should* not be present, the knowledge that 
two pointers refer to different logical address spaces (e.g. OpenCL 
address spaces) can be used to decide the aliasing.
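
A minimal sketch of what such an alias query could look like, written in Python rather than LLVM's C++ so that it stays self-contained. The `Pointer` record, its `logical_as` field and the `alias` function are hypothetical names for illustration, not LLVM APIs, and the sketch assumes that distinct logical address spaces never overlap (as with the OpenCL named address spaces):

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AliasResult(Enum):
    NO_ALIAS = "NoAlias"
    MAY_ALIAS = "MayAlias"


@dataclass(frozen=True)
class Pointer:
    name: str
    physical_as: int           # what the IR 'addrspace' modifier encodes today
    logical_as: Optional[str]  # hypothetical source-level tag, None if absent


def alias(a: Pointer, b: Pointer) -> AliasResult:
    # The logical tag is optional: without it we answer conservatively,
    # so ignoring the information is always safe.
    if a.logical_as is None or b.logical_as is None:
        return AliasResult.MAY_ALIAS
    # Assumption: distinct logical address spaces never overlap.
    if a.logical_as != b.logical_as:
        return AliasResult.NO_ALIAS
    return AliasResult.MAY_ALIAS


# On an X86-like target every pointer lives in physical address space 0.
g = Pointer("g", physical_as=0, logical_as="global")
l = Pointer("l", physical_as=0, logical_as="local")
u = Pointer("u", physical_as=0, logical_as=None)
print(alias(g, l).value, alias(g, u).value)
```

The physical address space plays no role in the decision here, which is the point: the logical tag adds information even when the target has a single physical address space.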

Currently the 'addrspace' modifier refers to target-defined address 
spaces (physical address spaces), so I would like to know whether this 
extension is a reasonable approach.
Otherwise, changing the 'addrspace' semantics could allow an alternative 
way: the middle end would be "automatically" aware of this information,
but the address space lowering would have to be moved elsewhere, before 
instruction selection, using some language-specific pass that produces the 
correct lowering. An issue with this approach is that the 
middle-end/back-end pipeline would rely on a language-specific pass or an 
equivalent mechanism during instruction selection.

Thanks in advance for your attention and for your answers.

Best regards,

Michele Scandale

Matt Arsenault

2013-Aug-07 21:07 UTC


On 08/07/2013 01:52 PM, Michele Scandale wrote:
> IMHO this information should be a plus that could be *safely* ignored 
> when not necessary and used where it can provide an improvement in 
> optimizations. This does not necessarily mean that the middle-end (and 
> the back-ends) must be aware of the semantics of these logical address 
> spaces; it would be enough just to distinguish between two logically 
> different address spaces.
> The first application I see is alias analysis: for targets that do not 
> have different physical address spaces (e.g. X86), meaning that in the 
> IR the 'addrspace' modifier *should* not be present, the knowledge 
> that two pointers refer to different logical address spaces (e.g. 
> OpenCL address spaces) can be used to decide the aliasing.

There was this patch from a long time ago that never went in to use the 
address spaces for alias analysis:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20111010/129728.html

The decision seems to be that LLVM addrspaces aren't required to not 
alias. I was thinking of following the suggestion to make the datalayout 
contain which address spaces can / cannot alias. Alternatively, the tbaa 
metadata might be appropriate for this, but I haven't looked at how that 
works.

Michele Scandale

2013-Aug-07 21:33 UTC


> There was this patch from a long time ago that never went in to use the
> address spaces for alias analysis:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20111010/129728.html
>
> The decision seems to be that LLVM addrspaces aren't required to not
> alias. I was thinking of following the suggestion to make the datalayout
> contain which address spaces can / cannot alias. Alternatively, the tbaa
> metadata might be appropriate for this, but I haven't looked at how that
> works.

Uhm... the fact that different address spaces may alias is a 
problem: target address spaces may alias for whatever reason... This is 
an additional aspect that must be analyzed.
Beyond this, my proposal is about adding the high-level information 
separately, so that it can be handled on its own, especially for those 
targets that do not use different address spaces in the clang 
target-info description, like X86. As said on the clang mailing list, I 
think it is not correct to cheat with the translation map and use the IR 
address spaces to represent OpenCL-like address spaces.

With this additional information it would IMO be easier to teach the alias 
analyzer: by construction the logical address spaces should be 
considered disjoint, so keeping the information about physical and 
logical address spaces separate would be fine for the aliasing problem.

What do you think about this?

Pete Cooper

2013-Aug-07 21:34 UTC


On Aug 7, 2013, at 2:07 PM, Matt Arsenault <Matthew.Arsenault at amd.com> wrote:
> On 08/07/2013 01:52 PM, Michele Scandale wrote:
>> IMHO this information should be a plus that could be *safely* ignored
>> when not necessary and used where it can provide an improvement in
>> optimizations. This does not necessarily mean that the middle-end (and the
>> back-ends) must be aware of the semantics of these logical address spaces;
>> it would be enough just to distinguish between two logically different
>> address spaces.
>> The first application I see is alias analysis: for targets that do not
>> have different physical address spaces (e.g. X86), meaning that in the IR
>> the 'addrspace' modifier *should* not be present, the knowledge that two
>> pointers refer to different logical address spaces (e.g. OpenCL address
>> spaces) can be used to decide the aliasing.
>
> There was this patch from a long time ago that never went in to use the
> address spaces for alias analysis:
> http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20111010/129728.html
>
> The decision seems to be that LLVM addrspaces aren't required to not
> alias. I was thinking of following the suggestion to make the datalayout
> contain which address spaces can / cannot alias. Alternatively, the tbaa
> metadata might be appropriate for this, but I haven't looked at how that
> works.

I haven’t thought about using TBAA metadata, but I think some form of metadata
would be useful here.

In the clang discussion
(http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20130715/084083.html)
you noted that address space 3 could be considered constant.  This is a very
useful piece of information in itself, and is something we should have in the
metadata.

I don’t know if CUDA has aliasing address spaces, but that would also be useful
to consider.  Something simple like this might work.  Note that I’m using the
examples from the clang discussion, that is "1 = opencl/cuda global, 2 =
opencl_local/cuda_shared, 3 = opencl/cuda constant".

!address_spaces = !{!0, !1, !2, !3}

; Address space tuple: { address space number, parent address space,
; additional properties }
!0 = metadata !{ i32 0, !{}, !{} }
!1 = metadata !{ i32 1, !0, !{} }
!2 = metadata !{ i32 2, !0, !{} }
!3 = metadata !{ i32 3, !0, !4 }

!4 = metadata !{ !"constant" }

This corresponds to 3 address spaces which all are members of address space 0,
but which otherwise do not alias each other.  I think this is roughly how TBAA
does things.  You can introduce any nodes in the tree of address spaces you need
to make children in the tree alias each other.

Additionally, the last address space is marked as constant which could be used
for optimization, e.g. LICM.
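
To make the tree interpretation concrete, here is a small Python sketch of how a consumer might read such a hierarchy. It is only an illustration under stated assumptions: the `AddrSpace` class and the `may_alias` / `is_constant` helpers are hypothetical names, and the rule used (two spaces may alias only when one is an ancestor of the other) is one plausible reading of the metadata above, not an existing LLVM analysis.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AddrSpace:
    number: int
    parent: Optional["AddrSpace"] = None
    properties: Tuple[str, ...] = ()


def chain(space):
    """Return the address space numbers from `space` up to the root."""
    numbers = []
    while space is not None:
        numbers.append(space.number)
        space = space.parent
    return numbers


def may_alias(a, b):
    # Two spaces may alias only if one is an ancestor of (or equal to) the other.
    return a.number in chain(b) or b.number in chain(a)


def is_constant(space):
    # A "constant" space never changes, so loads from it could be hoisted.
    return "constant" in space.properties


# Mirrors the metadata above: spaces 1..3 are children of space 0.
as0 = AddrSpace(0)
as1 = AddrSpace(1, as0)
as2 = AddrSpace(2, as0)
as3 = AddrSpace(3, as0, ("constant",))
print(may_alias(as1, as2), may_alias(as0, as3), is_constant(as3))
```

Siblings such as 1 and 2 come out as non-aliasing, while everything may alias with the root space 0. Adding an intermediate node between 0 and two children would make those children share a closer parent without changing the query.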

The alternative to this is to put everything in LLVM code itself.  Personally I
think metadata is better, but were it hard-coded in the LLVM code I wouldn’t
argue against it.

Thanks,
Pete

Michele Scandale

2013-Aug-10 06:46 UTC


(The previously sent message had some issues...)

----------------------------------------------

Hello everybody,

I just want to make a quick summary:

OBJECTIVE: find a way to represent logical (derived from source language
abstractions) address spaces in the IR as well, so that this information
can be exploited for optimizations.

CURRENT STATE:
The partial workaround that is probably used is, IMO: use custom address
spaces (the semantics of the 'addrspace' modifier are target-specific) so
that the IR still carries this knowledge, and "cross fingers" for code
generation (the implicit assumption being that backends without distinct
physical address spaces generally ignore this information).
In the middle-end the information may be used; in the backend the only
knowledge is indirect, for those load/store instructions that are still
linked to an IR Value*.

MY CRITIQUE OF THE CURRENT STATE:
The 'addrspace' modifier is not used correctly, and crossing fingers and
hoping that everything will be fine in the backend is not generally
acceptable: I would like an explicit mapping phase (which may be a no-op
for a specific target) with a consistency check. Considering the case of
targets with a unique address space, I would still like to exploit
high-level information about address spaces for optimizations.
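
As an illustration of what an explicit mapping phase with a consistency check could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the function name `map_address_spaces`, the logical space names, and the physical address space numbers of the second target are all made up.

```python
def map_address_spaces(used_logical, mapping, physical_spaces):
    """Map each used logical address space to a physical one, verifying
    that the mapping is complete and only names spaces the target has."""
    lowered = {}
    for logical in used_logical:
        if logical not in mapping:
            raise ValueError(f"no mapping for logical address space {logical!r}")
        physical = mapping[logical]
        if physical not in physical_spaces:
            raise ValueError(
                f"{logical!r} maps to unknown physical address space {physical}"
            )
        lowered[logical] = physical
    return lowered


used = ["global", "local", "constant"]

# Target with a single physical address space: the phase degenerates to
# mapping everything onto address space 0.
flat = map_address_spaces(used, {name: 0 for name in used}, {0})

# Hypothetical target with several physical address spaces.
split = map_address_spaces(
    used, {"global": 1, "local": 2, "constant": 3}, {0, 1, 2, 3}
)
print(flat, split)
```

The check fails loudly when a logical space has no mapping or maps to a physical space the target does not provide, instead of relying on the backend silently ignoring an unknown 'addrspace' value.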

TEMP PROPOSAL
*logical address space relationships*: use TBAA-style metadata. This
would make it possible to improve alias analysis and to describe some
properties of logical address spaces, e.g. a constant space for LICM.

*mapping information*:
a) use metadata -- but correctness should not depend on metadata
b) something like the data layout string -- not discussed yet, but it
exposes a static property in the IR, making it totally target dependent
c) encode the mapping in the backends -- IMO a bad idea because the
backends would become language aware
d) backends expose a detailed description of their address spaces so that
a higher layer (the frontend) can build the mapping from that
description -- I don't see how to handle cases with only one physical
address space
e) create a language-descriptive pass that is responsible for handling
language-specific and target-dependent information such as the address
space mapping -- a design issue is that the compiler pipeline would
depend on a language-specific pass, and there could be issues with
independent tools like opt, llc, etc... waiting for some feedback

RELATED PROBLEMS
1) addrspacecast support: approved but not implemented yet. It will
clean up address space conversions by localizing them in a single
instruction, allowing each target to define the semantics...
2) the code generator does not support different pointer types (at least
one for each physical address space).


Something missing or incorrect or wrong? Opinions?

Thanks in advance to everybody.

-Michele

Sergey Yakoushkin

2013-Aug-10 10:29 UTC


Hi Michele,

Are you considering nested address spaces?

Apart from OpenCL, named address spaces have been proposed in scope of
"Embedded C" draft N1275 (2007).
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1275.pdf

--> Section 5.2: Named address spaces and named-register storage classes

Summarizing, address spaces may overlap in a nested fashion. Typically
their names are intrinsic identifiers (e.g. "_x int t;") predefined at the
start of the translation unit, but the draft also mentions optional support
for user-defined address spaces.




Michele Scandale

2013-Aug-10 11:59 UTC


Hi Sergey,

On 08/10/2013 12:29 PM, Sergey Yakoushkin wrote:
> Hi Michele,
> 
> Are you considering nested address spaces?
> 
> Apart from OpenCL, named address spaces have been proposed in scope of
> "Embedded C" draft N1275 (2007).
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1275.pdf
> 
> --> Section 5.2: Named address spaces and named-register storage classes
> 
> Summarizing, address spaces may overlap in a nested fashion. Typically
> their names are intrinsic identifiers (e.g. "_x int t;") predefined at
> the start of the translation unit, but the draft also mentions optional
> support for user-defined address spaces.
> 
I've quickly read the document... I don't see any problem with nested
overlap; there are also constraints on the form of overlap (two address
spaces can be disjoint, equivalent, or one a subset of the other... pure
intersection is not allowed).
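
The nesting constraint can be sketched as a small check. This is only an illustration in Python: modelling an address space as a half-open integer range `(start, end)` is an assumption made for the example, and the `relation` and `violations` helpers are hypothetical names, not something defined by the draft.

```python
from itertools import combinations


def relation(a, b):
    """Classify how two half-open ranges (start, end) relate to each other."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end <= b_start or b_end <= a_start:
        return "disjoint"
    if a == b:
        return "equivalent"
    if b_start <= a_start and a_end <= b_end:
        return "subset"      # a is nested inside b
    if a_start <= b_start and b_end <= a_end:
        return "superset"    # b is nested inside a
    return "partial"         # pure intersection: not allowed


def violations(spaces):
    """Return the pairs of named spaces that overlap without nesting."""
    return [
        (x, y)
        for (x, rx), (y, ry) in combinations(spaces.items(), 2)
        if relation(rx, ry) == "partial"
    ]


# Made-up layout: 'near' is nested in 'far', 'io' is disjoint from both.
ok = {"far": (0, 1000), "near": (0, 256), "io": (2000, 2100)}
bad = {"a": (0, 10), "b": (5, 15)}
print(violations(ok), violations(bad))
```

With only these relations allowed, the set of address spaces always forms a forest, which fits the tree-style metadata discussed earlier in the thread.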

The part about address spaces linked to device-driver memory is, IMO,
something that the frontend must fully lower to function calls whose
names must be provided by the registration mechanism of the address
space: so nothing relevant to the IR would be here.

The remaining address spaces (a subset of what that document calls
intrinsic address spaces) should be the ones that the target (the
backend of the compiler) knows: so this information should also be
available in the frontend to ensure correct behavior in the backend. I
think such address spaces can be handled by the address_space attribute
that clang already supports as an internal mechanism.

Maybe (or probably) I am missing something...

Thanks.

-Michele

Iain Sandoe

2013-Aug-10 13:07 UTC


Hello all .. my first post to this thread (and this list, as it happens)..

On 10 Aug 2013, at 07:46, Michele Scandale wrote:
> I just want to make a quick summary:
A summary is indeed useful.
> OBJECTIVE: find a way to represent logical (derived from source language
> abstractions) address spaces in the IR as well, so that this information
> can be exploited for optimizations.
Seems an excellent objective.

It would seem to me that there are two (potentially overlapping, but distinct)
pieces of information,

A. The FE language rules for the semantics of whatever address spaces it defines
(which might even vary depending on compile-flags - e.g. a flag introducing some
language-defined parallel shared space).  Certainly these rules will vary in
interesting and non-obvious ways between different languages.

B. The target's definition of the physical address spaces that it provides,
and the rules that apply for determining interactions between these.

[It is agreed that address space attribute markup allows one to indicate (B) to
the FE - but, as I understand things, the FE merely passes that information
along at present**, and there is no way to specify the rules for operations
between spaces]

.. please forgive me if I'm restating things (or have missed a vital point)
- but trying to summarise in my own head:

My concern is that in order that the IR / optimisers remain agnostic to both
target and FE these two pieces would seem ideally provided separately?

that is the IR / optimisers need the union of those two pieces of information -
trying to amalgamate them seems risky, it would appear cleaner to specify them
independently.

FWIW, I completely concur with the need for type info on pointers - for my
use-case it would be helpful too.
> RELATED PROBLEMS
> 1) addrspacecast support: approved but not implemented yet. It will
> cleanup address spaces conversions localized in a single instruction
> allowing each target to define the semantic...
Is anyone known to be working on this?
> 2) code generator does not support different pointer types (at least one
> for each physical address space).
This is a very useful capability ... even without (1) and the discussion above.

( It is not my intention to hijack this thread )
- but I'd like to get some clarification on where we are with this and
what's needed to move forward.

I have an out-of-tree port that would (very much) like to make use of
address-spaces - but can't until this is working.

 - characteristics share some features with other posters on this thread:

 * different sized address regs
 * most address regs are larger than int regs
 * address regs only support a sub-set of operations
 * some address spaces are disjoint, some may overlap.

So far, I have declared a data layout with "-p0:xx:yy-p1:zz:aa-p2:... " etc.
and implemented getPointerWidthV() (and getPointerRegClass(), which never
seems to get called).

I see that global data items are correctly sized according to my pointer
definitions - but that all operations are carried out as if pointers were sized
for address space 0.
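
For reference, a small Python sketch of how the per-address-space pointer entries of a data layout string can be read. The layout string and its sizes are made up for the example, the helper names are hypothetical, and the fallback to address space 0 (then to 64 bits) for an unlisted address space is an assumption of this sketch rather than a statement about any particular LLVM version.

```python
import re

# Matches p[n]:<size>:<abi>[:<pref>[:<idx>]]; an empty n means address space 0.
_POINTER_SPEC = re.compile(r"p(\d*):(\d+):(\d+)(?::(\d+))?(?::(\d+))?")


def parse_pointer_specs(layout):
    """Collect the pointer entries of a data layout string, keyed by address space."""
    specs = {}
    for component in layout.split("-"):
        match = _POINTER_SPEC.fullmatch(component)
        if match is None:
            continue  # not a pointer entry (endianness, integer alignment, ...)
        addrspace = int(match.group(1) or 0)
        size = int(match.group(2))
        abi = int(match.group(3))
        pref = int(match.group(4) or abi)
        specs[addrspace] = {"size": size, "abi": abi, "pref": pref}
    return specs


def pointer_size_bits(specs, addrspace):
    # Assumed fallback: address space 0, then 64 bits if nothing is listed.
    if addrspace in specs:
        return specs[addrspace]["size"]
    if 0 in specs:
        return specs[0]["size"]
    return 64


# Made-up layout with three differently sized pointer types.
specs = parse_pointer_specs("e-p:32:32-p1:64:64-p2:16:16-i64:64")
print({space: pointer_size_bits(specs, space) for space in (0, 1, 2, 7)})
```

A query of this shape, keyed by address space, is what every pointer-sized operation would need to use instead of a single global pointer width.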

What is required to (at minimum) get the different size pointers working?

Whilst the points at (1) above are completely agreed - one can make good
progress by starting with (2) - and making it the User's responsibility to
obey the casting and overlap rules whilst (1) is implemented.

So .. (perhaps this should be a new thread) -
 I'd like to understand what is needed (in the short-term) to do:
 (2) then (1) ..  
 is it a question of updating Micah's patch (for 2) .. and starting from
scratch for (1)?

thanks
Iain
