Matthijs Kooijman
2008-Jul-22 08:36 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi Chris,> >> You probably want to somehow extend TargetData to encode the address > >> space descriptions, including pointer sizes and address space > >> relationships. > > Hmm, you have an excellent point there, pointer sizes could indeed easily > > have different sizes in different address spaces... That would make > > TargetData indeed a logical place to put such info. > > Yep, I think it makes sense for TargetData to have info about the size/ > alignment of a pointer in each addr space. That is also easy to encode.Which is an added bonus, but the original subject under discussion was the relations between each address space (equivalent, disjoint, subset/superset). Any thoughts on that? Should they also belong in TargetData?> > I'm not completely sure how a bitcast from one pointer type to another > > would work then, it would probably need zext in addition to casting? > > In the code generator, it can know a lot more information about what > happens when bitcasting from one address space to another. This > doesn't have to go into TargetData.Well, if the pointers really are differently sized, you couldn't just bitcast between them, since bitcast requires values of equal bitlength? Or is this something that would only be exposed when doing intoptr/ptrtoint and simply assumed to be handled in the backend for pointer-to-pointer bitcasts?> > In this case, finding some text representation for the address space > > relations make sense, so that these relations can be embedded into the > > LLVM IR (possibly inside the "target datalayout" entry, or perhaps in a > > separate "target addrspaces" entry? > > Are you envisioning an LLVM IR pass that mucks around with these? > When would that be useful?Not a particular pass, but passes in general. Specifically, I would like instcombining to be able to use this info to remove useless bitcasts. Also, this information is useful for clang to know when inserting an implicit cast makes sense and when it would be an error.> You can see this in Linker::LinkModules, basically one randomly overrides > the other unless one modules is empty, in which case the non- empty one > wins.Shouldn't this give a warning instead of silently picking one? Also, couldn't this produces an invalid module? Say, for example that one module has 16 bit pointers and the other 32 bit. If we use 32 bit as the resulting pointer size, but the other module contained something like %int = ptrtoint i32* %A to i16 This could suddenly cause truncation of the pointer, where it didn't in the original module. Warning for this seems appropriate? The same would go for different address space configurations, I expect mixing those could cause nasty, but probabl very subtle bugs... Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080722/abc5e128/attachment.sig>
Chris Lattner
2008-Jul-22 17:12 UTC
[LLVMdev] Casting between address spaces and address space semantics
On Jul 22, 2008, at 1:36 AM, Matthijs Kooijman wrote:>> Yep, I think it makes sense for TargetData to have info about the >> size/ >> alignment of a pointer in each addr space. That is also easy to >> encode. > Which is an added bonus, but the original subject under discussion > was the > relations between each address space (equivalent, disjoint, subset/ > superset). > Any thoughts on that? Should they also belong in TargetData?Only if absolutely required, see below.>>> I'm not completely sure how a bitcast from one pointer type to >>> another >>> would work then, it would probably need zext in addition to casting? >> >> In the code generator, it can know a lot more information about what >> happens when bitcasting from one address space to another. This >> doesn't have to go into TargetData. > Well, if the pointers really are differently sized, you couldn't > just bitcast > between them, since bitcast requires values of equal bitlength? Or > is this > something that would only be exposed when doing intoptr/ptrtoint and > simply > assumed to be handled in the backend for pointer-to-pointer bitcasts?I think it should be safe for instcombine to assume there is no truncation of pointers. If you have ptra -> ptrb -> ptra, it is theoretically possible that ptrb is smaller than ptra pointers. However, this is a form of pointer overflow, which llvm treats as undefined already. Further, it isn't clear to me that this is going to be common enough for instcombine to really worry about. Only handling it in the code generator seems fine to me, but maybe you have a different use case for addrspaces in mind.>>> In this case, finding some text representation for the address space >>> relations make sense, so that these relations can be embedded into >>> the >>> LLVM IR (possibly inside the "target datalayout" entry, or >>> perhaps in a >>> separate "target addrspaces" entry? >> >> Are you envisioning an LLVM IR pass that mucks around with these? >> When would that be useful? > Not a particular pass, but passes in general. Specifically, I would > like > instcombining to be able to use this info to remove useless > bitcasts. Also, > this information is useful for clang to know when inserting an > implicit cast > makes sense and when it would be an error.Clang should just reject implicit casts in *any* case when the addr spaces differ IMO. In any case, clang has it's own parameterization of the target which is richer than TargetData already, so if it really wanted to allow implicit casts between address space subsets, it could do so based on its own info.>> You can see this in Linker::LinkModules, basically one randomly >> overrides >> the other unless one modules is empty, in which case the non- empty >> one >> wins. > Shouldn't this give a warning instead of silently picking one?What would the warning say? There is nothing useful the linker could tell the user, and having a low level tool like the llvm linker library emit warnings (where? to stderr?) is extremely impolite for some clients. If a specific client cares about this, it should check and report in its own way. -Chris
Matthijs Kooijman
2008-Aug-06 16:24 UTC
[LLVMdev] Casting between address spaces and address space semantics
Hi all, I've been a tad busy in the last few weeks, but I don't think this issue is settled yet.> > the relations between each address space (equivalent, disjoint, subset/ > > superset). > > Any thoughts on that? Should they also belong in TargetData? > Only if absolutely required, see below.Is there any other place they would fit in better? What piece of below were you referring to exactly? Would it be useful to have a class that works just like TargetData does? Ie, an immutable pass that can be added to the passmanager in addition to TargetData? Anyway, let's see what the result of the below discussion is first, I guess.> > Specifically, I would like instcombining to be able to use this info to > > remove useless bitcasts. Also, this information is useful for clang to > > know when inserting an implicit cast makes sense and when it would be an > > error. > Clang should just reject implicit casts in *any* case when the addr > spaces differ IMO. In any case, clang has it's own parameterization > of the target which is richer than TargetData already, so if it really > wanted to allow implicit casts between address space subsets, it could > do so based on its own info.I think that adding this info to Clang's targetinfo code would indeed make sense, though perhaps disallowing implicit casts entirely is also acceptable (provided that you can still do explicit casts). However, what I find more important (this discussion is greatly helping me in finding out what exactly it is that I find important :-p) is that instcombine (or some other pass, for all I care) can remove bitcasts that are not strictly needed. I will focus on address spaces that are nested, since that is the most interesting and general case. Say we have address spaces SubA, SubB and Super, and SubA and SubB are subsets of Super (but disjoint to each other). When is bitcast between address spaces really needed? I can see two main reasons. 1) A function gets passed pointers into both SubA and SubB (which then need to be bitcasted to pointer-into-Super). 2) A pointer variable is assigned pointers into both SubA and SubB. In LLVM this would probably mean a phi node that joins pointers into different address spaces (which is not possible, hence a bitcast is needed and the phi node would be of type pointer-into-super). However, after some transformations (mainly code duplication AFAICS), the above situations might get removed, making the bitcasts redundant. The patch I attached a few posts back implements this by removing the bitcast when possible. This does mean that any users of the bitcast now have one of their operands changed (the address space gets changed). Take for example a load instruction, that directly uses a bitcasted pointer. When removing the bitcast, the load instruction can be replaced by a new one that takes a pointer into another (smaller) address space. The result of the load instruction will still be the same value, but a bitcast was removed (and the load instruction was possibly improved as well). Let's look at two use cases I can see for nested address spaces. 1) Different address spaces translate to different accessing instructions at code generation time. Instructions that access smaller address spaces (SubA, SubB) can execute faster than instructions that access the Super address space. Because of this, removing unneeded bitcasts to Super (and then replacing, for example, load instructions with their SubA or SubB variants) can result in faster code. This is currently not our use case, but depending on where our hardware design goes might become so. 2) Having the Super address space as an address space that is usable in the input C code, but should be optimized away before it hits the backend. Removing those redundant bitcasts allows us to prevent a lot of code duplication or ugly macro use in our C code, while still ending up with code our backend can compile. This is our current use case. For both these use cases, it would be needed to remove redundant bitcasts. To be able to do that, the transformation passes (or at least the one the will do this) needs to know the relations between the different address spaces. At the very least, redundant bitcasts should be removed. It hacked up a patch to make instcombine do this, but that only makes sense if removing those bitcasts allows instcombine to do further optimizations, which again allows bitcasts to be removed. I have the feeling that propagating address spaces might be quite orthogonal to the other transformations instcombine does, so a seperate pass for just this might be better. I think the main question here is, are there any other transformations that would need (or benefit from) information on address space relations? If not, then it might be sufficient to pass an "AddressSpacesDescriptor" to just that one pass. If there are other transformations, it would be useful to have one central place to store this information (like TargetData now). I've only just realized that this is an essential question, so I haven't given it too much thought yet. In either case, we should wonder in what way this information is given to the tools (opt/llc). If we make it dependent on the target architecture, this only becomes an issue once we support an architecture where this is relevant. Alternatively, we could have some way of specifying the relations on the commandline (either directly or by choosing between a number of models). Which of these is chosen (if any) would not be too relevant for us, since we have our own compiler driver which can pass anything to the (instcombine or other) pass in a hardcoded fashion. Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/c598c4a9/attachment.sig>
Matthijs Kooijman
2008-Aug-06 16:25 UTC
[LLVMdev] Linking modules with different targetdata (Was: Casting between address spaces and address space) semantics
Hi Chris, I've put this into a seperate reply, since it is really a seperate issue.> >> You can see this in Linker::LinkModules, basically one randomly > >> overrides the other unless one modules is empty, in which case the non- > >> empty one wins. > > Shouldn't this give a warning instead of silently picking one? > What would the warning say? There is nothing useful the linker could > tell the user, and having a low level tool like the llvm linker > library emit warnings (where? to stderr?) is extremely impolite for > some clients. If a specific client cares about this, it should check > and report in its own way.The linker already reports errors through an ErrorMsg output argument on LinkInModule, though looking at it now, the other methods of llvm::Linker don't have any way of outputting things. Since this wouldn't really be an error but a warning, doing this in Linker is probably too much fuss anyway. Which would change my point slightly: Perhaps llvm-link and/or llmv-ld should check the targetdata's and give a warning if they don't match? Gr. Matthijs -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 189 bytes Desc: Digital signature URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080806/fda3a421/attachment.sig>
Reasonably Related Threads
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics
- [LLVMdev] Casting between address spaces and address space semantics