Matthijs Kooijman
2008-Aug-06 16:24 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi all, I've been a tad busy in the last few weeks, but I don't think this issue is settled yet.> > the relations between each address space (equivalent, disjoint, subset/ > > superset). > > Any thoughts on that? Should they also belong in TargetData? > Only if absolutely required, see below.Is there any other place they would fit in better? What piece of below were you referring to exactly? Would it be useful to have a class that works just like TargetData does? Ie, an immutable pass that can be added to the passmanager in addition to TargetData? Anyway, let's see what the result of the below discussion is first, I guess.> > Specifically, I would like instcombining to be able to use this info to > > remove useless bitcasts. Also, this information is useful for clang to > > know when inserting an implicit cast makes sense and when it would be an > > error. > Clang should just reject implicit casts in *any* case when the addr > spaces differ IMO. In any case, clang has it's own parameterization > of the target which is richer than TargetData already, so if it really > wanted to allow implicit casts between address space subsets, it could > do so based on its own info.I think that adding this info to Clang's targetinfo code would indeed make sense, though perhaps disallowing implicit casts entirely is also acceptable (provided that you can still do explicit casts). However, what I find more important (this discussion is greatly helping me in finding out what exactly it is that I find important :-p) is that instcombine (or some other pass, for all I care) can remove bitcasts that are not strictly needed. I will focus on address spaces that are nested, since that is the most interesting and general case. Say we have address spaces SubA, SubB and Super, and SubA and SubB are subsets of Super (but disjoint to each other). When is bitcast between address spaces really needed? I can see two main reasons. 1) A function gets passed pointers into both SubA and SubB (which then need to be bitcasted to pointer-into-Super). 2) A pointer variable is assigned pointers into both SubA and SubB. In LLVM this would probably mean a phi node that joins pointers into different address spaces (which is not possible, hence a bitcast is needed and the phi node would be of type pointer-into-super). However, after some transformations (mainly code duplication AFAICS), the above situations might get removed, making the bitcasts redundant. The patch I attached a few posts back implements this by removing the bitcast when possible. This does mean that any users of the bitcast now have one of their operands changed (the address space gets changed). Take for example a load instruction, that directly uses a bitcasted pointer. When removing the bitcast, the load instruction can be replaced by a new one that takes a pointer into another (smaller) address space. The result of the load instruction will still be the same value, but a bitcast was removed (and the load instruction was possibly improved as well). Let's look at two use cases I can see for nested address spaces. 1) Different address spaces translate to different accessing instructions at code generation time. Instructions that access smaller address spaces (SubA, SubB) can execute faster than instructions that access the Super address space. Because of this, removing unneeded bitcasts to Super (and then replacing, for example, load instructions with their SubA or SubB variants) can result in faster code. This is currently not our use case, but depending on where our hardware design goes might become so. 2) Having the Super address space as an address space that is usable in the input C code, but should be optimized away before it hits the backend. Removing those redundant bitcasts allows us to prevent a lot of code duplication or ugly macro use in our C code, while still ending up with code our backend can compile. This is our current use case. For both these use cases, it would be needed to remove redundant bitcasts. To be able to do that, the transformation passes (or at least the one the will do this) needs to know the relations between the different address spaces. At the very least, redundant bitcasts should be removed. It hacked up a patch to make instcombine do this, but that only makes sense if removing those bitcasts allows instcombine to do further optimizations, which again allows bitcasts to be removed. I have the feeling that propagating address spaces might be quite orthogonal to the other transformations instcombine does, so a seperate pass for just this might be better. I think the main question here is, are there any other transformations that would need (or benefit from) information on address space relations? If not, then it might be sufficient to pass an "AddressSpacesDescriptor" to just that one pass. If there are other transformations, it would be useful to have one central place to store this information (like TargetData now). I've only just realized that this is an essential question, so I haven't given it too much thought yet. In either case, we should wonder in what way this information is given to the tools (opt/llc). If we make it dependent on the target architecture, this only becomes an issue once we support an architecture where this is relevant. Alternatively, we could have some way of specifying the relations on the commandline (either directly or by choosing between a number of models). Which of these is chosen (if any) would not be too relevant for us, since we have our own compiler driver which can pass anything to the (instcombine or other) pass in a hardcoded fashion. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/c598c4a9/attachment.sig>
Mon P Wang
2008-Aug-07 06:16 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi Matthijs,> >>> Specifically, I would like instcombining to be able to use this >>> info to >>> remove useless bitcasts. Also, this information is useful for >>> clang to >>> know when inserting an implicit cast makes sense and when it >>> would be an >>> error. >> Clang should just reject implicit casts in *any* case when the addr >> spaces differ IMO. In any case, clang has it's own parameterization >> of the target which is richer than TargetData already, so if it >> really >> wanted to allow implicit casts between address space subsets, it >> could >> do so based on its own info. > I think that adding this info to Clang's targetinfo code would > indeed make > sense, though perhaps disallowing implicit casts entirely is also > acceptable > (provided that you can still do explicit casts). >Like most other type qualifier, explicit casts are allowed. However, if some one uses it in a context which is illegal, the result is undefined (e.g.,., cast a pointer from one space to another one address and then uses it where the address spaces don't overlap). The C99 extension indicates that a cast of a pointer to one address space to another is valid only if the object that pointer points to exists in both address spaces (otherwise, the result is undefined). I think we all agree that the default behavior in clang should reject implicit casts between address spaces unless clang has some additional information to know better.> However, what I find more important (this discussion is greatly > helping me in > finding out what exactly it is that I find important :-p) is that > instcombine > (or some other pass, for all I care) can remove bitcasts that are > not strictly > needed. > > I will focus on address spaces that are nested, since that is the most > interesting and general case. Say we have address spaces SubA, SubB > and Super, and > SubA and SubB are subsets of Super (but disjoint to each other). > > When is bitcast between address spaces really needed? I can see two > main > reasons. > 1) A function gets passed pointers into both SubA and SubB (which > then need > to be bitcasted to pointer-into-Super). > 2) A pointer variable is assigned pointers into both SubA and SubB. > In LLVM > this would probably mean a phi node that joins pointers into > different > address spaces (which is not possible, hence a bitcast is needed > and the > phi node would be of type pointer-into-super). > > However, after some transformations (mainly code duplication > AFAICS), the > above situations might get removed, making the bitcasts redundant. > The patch I > attached a few posts back implements this by removing the bitcast when > possible. This does mean that any users of the bitcast now have one > of their > operands changed (the address space gets changed). > > Take for example a load instruction, that directly uses a bitcasted > pointer. > > When removing the bitcast, the load instruction can be replaced by a > new one > that takes a pointer into another (smaller) address space. The > result of the > load instruction will still be the same value, but a bitcast was > removed (and > the load instruction was possibly improved as well). > > Let's look at two use cases I can see for nested address spaces. > > 1) Different address spaces translate to different accessing > instructions at > code generation time. Instructions that access smaller address > spaces > (SubA, SubB) can execute faster than instructions that access the > Super > address space. Because of this, removing unneeded bitcasts to > Super (and > then replacing, for example, load instructions with their SubA or > SubB > variants) can result in faster code. This is currently not our use > case, > but depending on where our hardware design goes might become so.> 2) Having the Super address space as an address space that is usable > in the > input C code, but should be optimized away before it hits the > backend. > Removing those redundant bitcasts allows us to prevent a lot of code > duplication or ugly macro use in our C code, while still ending up > with > code our backend can compile. This is our current use case. > > For both these use cases, it would be needed to remove redundant > bitcasts. To > be able to do that, the transformation passes (or at least the one > the will do > this) needs to know the relations between the different address > spaces. >I would imagine that such a cleanup phase will probably need to track what objects a pointer can point to in some region of code so it can either remove redundant casts or add a cast to refine the information of what a pointer can point to.> At the very least, redundant bitcasts should be removed. It hacked > up a patch > to make instcombine do this, but that only makes sense if removing > those > bitcasts allows instcombine to do further optimizations, which again > allows > bitcasts to be removed. I have the feeling that propagating address > spaces > might be quite orthogonal to the other transformations instcombine > does, so a > seperate pass for just this might be better. > > I think the main question here is, are there any other > transformations that > would need (or benefit from) information on address space relations? > If not, > then it might be sufficient to pass an "AddressSpacesDescriptor" to > just that > one pass.Alias analysis could use the information to help determine if two pointer aliases. As you indicated, a cleanup phases could use the information. I don't think most other phases care as what they want to know is what address space a pointer points to and not the relationship about the address spaces.> If there are other transformations, it would be useful to have one > central > place to store this information (like TargetData now). > > I've only just realized that this is an essential question, so I > haven't given > it too much thought yet. > > > In either case, we should wonder in what way this information is > given to the > tools (opt/llc). If we make it dependent on the target architecture, > this only > becomes an issue once we support an architecture where this is > relevant. > Alternatively, we could have some way of specifying the relations on > the > commandline (either directly or by choosing between a number of > models). Which > of these is chosen (if any) would not be too relevant for us, since > we have > our own compiler driver which can pass anything to the (instcombine > or other) > pass in a hardcoded fashion.IMO, I would think that address space relationship is based on the target architecture so it seems most natural to associate with the target. Regards, -- Mon Ping -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/d7d22bce/attachment.html>
Matthijs Kooijman
2008-Aug-07 14:41 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi Mon Ping, I've again attached a patch, wich lets LLVM know about about the relations between different address spaces. Instead of cramming this info in with TargetData (with all kinds of unwanted side effects, especially for the IR), I opted to create a new pass, TargetAddrspaces, which holds this information. Just like TargetData, this can be added to the passmanager by the tool running the passes. Unlike TargetData, however, I wanted to be able to subclass TargetAddrspaces to change the behaviour. To make this possible, I made TargetAddrspaces an analysis group, with a default implementation AllDisjointAddrspaces (which, as the name suggests, returns Disjoint for all combinations). Having a default implementations is useful, so tools are not required to add a TargetAddrspaces pass to a passmanager. The last part of this patch is an addition to InstCombine to make use of this information: It removes any bitcasts from a subset to a superset address space. It gets at the address space information by requiring the TargetAddrspaces analysis, which will give it the default implementation in all current tools. So, this requires a minimal amount of changes, the current code won't know the difference, and in our custom frontend we provide another implementation of TargetAddrspaces and things work very well. There are a number of issues still, though. I'm not 100% sure that using an Analysis pass for this is correct, but currently only analysis passes can be grouped (though I have the suspicion that this is mainly a question of naming, really). Additionally, the "Writing an LLVM Pass" document states: "There must be exactly one default implementation available at all times for an Analysis Group to be used. Only default implementation can derive from ImmutablePass." I'm not completely sure why there cannot be other implementations that derive from ImmutablePass. In this case, I want to have any implementation of TargetAddrspaces also derive from ImmutablePass, because it makes sense. In practice, this works as well, our custom implementeation as well as the default implementation derive from ImmutablePass and everything compiles and runs fine. Is this perhaps an outdated statement? Mon Ping suggests using address space information for alias analysis as well, which seems to make sense. In effect this is a form of type-based alias analysis, but different address spaces don't preclude pointers from being equal. A problem here is that pointers in disjoint address spaces would be marked as not aliasing, but when the default relation is disjoint this is not so conservative. This might require an extra option "Unknown", which can be used as the default instead of "Disjoint". For Unknown, any pass can do the conservative thing. Lastly, I'm still not so sure if InstCombine is the right place for this simplification. This needs some more thought, but currently it is a problem that instcombine does not process BitCastConstantExprs. I might end up writing a seperate pass for just this. So again, suggestions welcome. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: addrspaces.diff Type: text/x-diff Size: 9792 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080807/b63e2bde/attachment.diff> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080807/b63e2bde/attachment.sig>
Reasonably Related Threads
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics