thr3ads.net - llvm dev - [LLVMdev] Casting between address spaces and address space semantics [Jul 2008]

If this information is useful, please help other people find it:
Share via:

Matthijs Kooijman

2008-Jul-22 08:36 UTC

[LLVMdev] Casting between address spaces and address space semantics

Hi Chris,

> >> You probably want to somehow extend TargetData to encode the
address
> >> space descriptions, including pointer sizes and address space
> >> relationships.
> > Hmm, you have an excellent point there, pointer sizes could indeed 
easily
> > have different sizes in different address spaces... That would make
> > TargetData indeed a logical place to put such info.
> 
> Yep, I think it makes sense for TargetData to have info about the size/ 
> alignment of a pointer in each addr space.  That is also easy to encode.Which is an added bonus, but the original subject under discussion was the
relations between each address space (equivalent, disjoint, subset/superset).
Any thoughts on that? Should they also belong in TargetData?
> > I'm not completely sure how a bitcast from one pointer type to 
another
> > would work then, it would probably need zext in addition to casting?
> 
> In the code generator, it can know a lot more information about what  
> happens when bitcasting from one address space to another.  This  
> doesn't have to go into TargetData.Well, if the pointers really are differently sized, you couldn't just
bitcast
between them, since bitcast requires values of equal bitlength? Or is this
something that would only be exposed when doing intoptr/ptrtoint and simply
assumed to be handled in the backend for pointer-to-pointer bitcasts?
> > In this case, finding some text representation for the address space
> > relations make sense, so that these relations can be embedded into the
> > LLVM IR  (possibly inside the "target datalayout" entry, or
perhaps in a
> > separate "target addrspaces" entry?
> 
> Are you envisioning an LLVM IR pass that mucks around with these?   
> When would that be useful?Not a particular pass, but passes in general. Specifically, I would like
instcombining to be able to use this info to remove useless bitcasts. Also,
this information is useful for clang to know when inserting an implicit cast
makes sense and when it would be an error.
> You can see this in Linker::LinkModules, basically one randomly  overrides
> the other unless one modules is empty, in which case the non- empty one
> wins.Shouldn't this give a warning instead of silently picking one? Also,
couldn't
this produces an invalid module? Say, for example that one module has 16 bit
pointers and the other 32 bit. If we use 32 bit as the resulting pointer size,
but the other module contained something like
	%int = ptrtoint i32* %A to i16

This could suddenly cause truncation of the pointer, where it didn't in the
original module. Warning for this seems appropriate? The same would go for
different address space configurations, I expect mixing those could cause
nasty, but probabl very subtle bugs...

Gr.

Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080722/abc5e128/attachment.sig>

Chris Lattner

2008-Jul-22 17:12 UTC

head link

[LLVMdev] Casting between address spaces and address space semantics

On Jul 22, 2008, at 1:36 AM, Matthijs Kooijman wrote:>> Yep, I think it makes sense for TargetData to have info about the  
>> size/
>> alignment of a pointer in each addr space.  That is also easy to  
>> encode.
> Which is an added bonus, but the original subject under discussion  
> was the
> relations between each address space (equivalent, disjoint, subset/ 
> superset).
> Any thoughts on that? Should they also belong in TargetData?
Only if absolutely required, see below.
>>> I'm not completely sure how a bitcast from one pointer type to
>>> another
>>> would work then, it would probably need zext in addition to
casting?
>>
>> In the code generator, it can know a lot more information about what
>> happens when bitcasting from one address space to another.  This
>> doesn't have to go into TargetData.
> Well, if the pointers really are differently sized, you couldn't  
> just bitcast
> between them, since bitcast requires values of equal bitlength? Or  
> is this
> something that would only be exposed when doing intoptr/ptrtoint and  
> simply
> assumed to be handled in the backend for pointer-to-pointer bitcasts?
I think it should be safe for instcombine to assume there is no  
truncation of pointers.  If you have ptra -> ptrb -> ptra, it is  
theoretically possible that ptrb is smaller than ptra pointers.   
However, this is a form of pointer overflow, which llvm treats as  
undefined already.  Further, it isn't clear to me that this is going  
to be common enough for instcombine to really worry about.  Only  
handling it in the code generator seems fine to me, but maybe you have  
a different use case for addrspaces in mind.
>>> In this case, finding some text representation for the address
space
>>> relations make sense, so that these relations can be embedded into
>>> the
>>> LLVM IR  (possibly inside the "target datalayout" entry,
or
>>> perhaps in a
>>> separate "target addrspaces" entry?
>>
>> Are you envisioning an LLVM IR pass that mucks around with these?
>> When would that be useful?
> Not a particular pass, but passes in general. Specifically, I would  
> like
> instcombining to be able to use this info to remove useless  
> bitcasts. Also,
> this information is useful for clang to know when inserting an  
> implicit cast
> makes sense and when it would be an error.
Clang should just reject implicit casts in *any* case when the addr  
spaces differ IMO.  In any case, clang has it's own parameterization  
of the target which is richer than TargetData already, so if it really  
wanted to allow implicit casts between address space subsets, it could  
do so based on its own info.
>> You can see this in Linker::LinkModules, basically one randomly   
>> overrides
>> the other unless one modules is empty, in which case the non- empty  
>> one
>> wins.
> Shouldn't this give a warning instead of silently picking one?
What would the warning say?  There is nothing useful the linker could  
tell the user, and having a low level tool like the llvm linker  
library emit warnings (where? to stderr?) is extremely impolite for  
some clients.  If a specific client cares about this, it should check  
and report in its own way.

-Chris

Matthijs Kooijman

2008-Aug-06 16:24 UTC

head link

[LLVMdev] Casting between address spaces and address space semantics

Hi all,

I've been a tad busy in the last few weeks, but I don't think this issue
is
settled yet.
> > the relations between each address space (equivalent, disjoint,
subset/
> > superset).
> > Any thoughts on that? Should they also belong in TargetData?
> Only if absolutely required, see below.Is there any other place they would fit in better? What piece of below were
you referring to exactly? Would it be useful to have a class that works just
like TargetData does? Ie, an immutable pass that can be added to the
passmanager in addition to TargetData? Anyway, let's see what the result of
the below discussion is first, I guess.
> > Specifically, I would  like instcombining to be able to use this info
to
> > remove useless  bitcasts. Also, this information is useful for clang
to
> > know when inserting an  implicit cast makes sense and when it would be
an
> > error.
> Clang should just reject implicit casts in *any* case when the addr  
> spaces differ IMO.  In any case, clang has it's own parameterization  
> of the target which is richer than TargetData already, so if it really  
> wanted to allow implicit casts between address space subsets, it could  
> do so based on its own info.I think that adding this info to Clang's targetinfo code would indeed make
sense, though perhaps disallowing implicit casts entirely is also acceptable
(provided that you can still do explicit casts).

However, what I find more important (this discussion is greatly helping me in
finding out what exactly it is that I find important :-p) is that instcombine
(or some other pass, for all I care) can remove bitcasts that are not strictly
needed.

I will focus on address spaces that are nested, since that is the most
interesting and general case. Say we have address spaces SubA, SubB and Super,
and
SubA and SubB are subsets of Super (but disjoint to each other).

When is bitcast between address spaces really needed? I can see two main
reasons.
 1) A function gets passed pointers into both SubA and SubB (which then need
    to be bitcasted to pointer-into-Super).
 2) A pointer variable is assigned pointers into both SubA and SubB. In LLVM
    this would probably mean a phi node that joins pointers into different
		address spaces (which is not possible, hence a bitcast is needed and the
		phi node would be of type pointer-into-super).

However, after some transformations (mainly code duplication AFAICS), the
above situations might get removed, making the bitcasts redundant. The patch I
attached a few posts back implements this by removing the bitcast when
possible. This does mean that any users of the bitcast now have one of their
operands changed (the address space gets changed).

Take for example a load instruction, that directly uses a bitcasted pointer.

When removing the bitcast, the load instruction can be replaced by a new one
that takes a pointer into another (smaller) address space. The result of the
load instruction will still be the same value, but a bitcast was removed (and
the load instruction was possibly improved as well).

Let's look at two use cases I can see for nested address spaces.

 1) Different address spaces translate to different accessing instructions at
    code generation time. Instructions that access smaller address spaces
		(SubA, SubB) can execute faster than instructions that access the Super
		address space. Because of this, removing unneeded bitcasts to Super (and
		then replacing, for example, load instructions with their SubA or SubB
		variants) can result in faster code. This is currently not our use case,
		but depending on where our hardware design goes might become so.
 2) Having the Super address space as an address space that is usable in the
		input C code, but should be optimized away before it hits the backend.
		Removing those redundant bitcasts allows us to prevent a lot of code
		duplication or ugly macro use in our C code, while still ending up with
		code our backend can compile. This is our current use case.

For both these use cases, it would be needed to remove redundant bitcasts. To
be able to do that, the transformation passes (or at least the one the will do
this) needs to know the relations between the different address spaces.

At the very least, redundant bitcasts should be removed. It hacked up a patch
to make instcombine do this, but that only makes sense if removing those
bitcasts allows instcombine to do further optimizations, which again allows
bitcasts to be removed. I have the feeling that propagating address spaces
might be quite orthogonal to the other transformations instcombine does, so a
seperate pass for just this might be better.

I think the main question here is, are there any other transformations that
would need (or benefit from) information on address space relations? If not,
then it might be sufficient to pass an "AddressSpacesDescriptor" to
just that
one pass.

If there are other transformations, it would be useful to have one central
place to store this information (like TargetData now).

I've only just realized that this is an essential question, so I haven't
given
it too much thought yet.


In either case, we should wonder in what way this information is given to the
tools (opt/llc). If we make it dependent on the target architecture, this only
becomes an issue once we support an architecture where this is relevant.
Alternatively, we could have some way of specifying the relations on the
commandline (either directly or by choosing between a number of models). Which
of these is chosen (if any) would not be too relevant for us, since we have
our own compiler driver which can pass anything to the (instcombine or other)
pass in a hardcoded fashion.

Gr.

Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/c598c4a9/attachment.sig>

Matthijs Kooijman

2008-Aug-06 16:25 UTC

head link

[LLVMdev] Linking modules with different targetdata (Was: Casting between address spaces and address space) semantics

Hi Chris,

I've put this into a seperate reply, since it is really a seperate issue.
> >> You can see this in Linker::LinkModules, basically one randomly
> >> overrides the other unless one modules is empty, in which case the
non-
> >> empty  one wins.
> > Shouldn't this give a warning instead of silently picking one?
> What would the warning say?  There is nothing useful the linker could  
> tell the user, and having a low level tool like the llvm linker  
> library emit warnings (where? to stderr?) is extremely impolite for  
> some clients.  If a specific client cares about this, it should check  
> and report in its own way.
The linker already reports errors through an ErrorMsg output argument on
LinkInModule, though looking at it now, the other methods of llvm::Linker
don't have any way of outputting things. Since this wouldn't really be
an
error but a warning, doing this in Linker is probably too much fuss anyway.
Which would change my point slightly: Perhaps llvm-link and/or llmv-ld should
check the targetdata's and give a warning if they don't match?

Gr.

Matthijs
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 189 bytes
Desc: Digital signature
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/fda3a421/attachment.sig>

Possibly Parallel Threads

Search for more reasonably related threads

llvm dev - Jul 2008 - [LLVMdev] Casting between address spaces and address space semantics

[LLVMdev] Casting between address spaces and address space semantics

[LLVMdev] Casting between address spaces and address space semantics

[LLVMdev] Casting between address spaces and address space semantics

[LLVMdev] Linking modules with different targetdata (Was: Casting between address spaces and address space) semantics

Possibly Parallel Threads