Hi everyone,

I am currently working on a backend for the TriCore architecture. Unfortunately, I have hit an issue with LLVM's internal representation that's giving me a bit of a headache.

The problem is that LLVM assumes that a pointer is equivalent to a machine-word sized integer. This implies that all pointer arithmetic takes place in the CPU's general-purpose registers and is done with the "regular" integer instructions.

Unfortunately, this does not hold true for the TriCore architecture, which strictly differentiates between "normal" integer values and pointer values. The register set is split into two subsets: 16 general-purpose registers %d0..%d15 for 32-bit integers and floats, and 16 address registers %a0..%a15 for 32-bit pointers, with separate instructions. Moreover, the ABI requires that pointer arguments to (and pointer results from) functions be passed in address registers instead of general-purpose registers.

As LLVM internally converts all pointers to integers (in my case i32), there is no way for a backend to tell whether an i32 operand is really an integer or actually a pointer. Thus neither the instruction selection nor the CallingConvention stuff works for me as expected.

It does not seem possible to solve this problem without modifying at least some of the original LLVM source code. So what would be the easiest (and least invasive) way to achieve this?

I have thought about adding a new ValueType (say, "p32") and overriding TargetLowering::getPointerTy() to return that new type instead of i32. Of course, this would probably be more of a dirty hack than an actual solution, but hopefully would do the trick - provided I'm not missing something...

Comments and suggestions are highly welcome. Thank you for your time!

Christoph
What about adding an annotation to each i32 Value that represents a pointer, or annotating the "i32" type of that pointer?
On May 6, 2009, at 1:58 AM, Christoph Erhardt wrote:

> Hi everyone,
>
> I am currently working on a backend for the TriCore architecture.
> Unfortunately, I have hit an issue with LLVM's internal representation
> that's giving me a bit of a headache.
>
> The problem is that LLVM assumes that a pointer is equivalent to a
> machine-word sized integer. This implies that all pointer arithmetic
> takes place in the CPU's general-purpose registers and is done with the
> "regular" integer instructions.
> Unfortunately, this does not hold true for the TriCore architecture,
> which strictly differentiates between "normal" integer values and
> pointer values. The register set is split into two subsets: 16
> general-purpose registers %d0..%d15 for 32-bit integers and floats, and
> 16 address registers %a0..%a15 for 32-bit pointers, with separate
> instructions. Moreover, the ABI requires that pointer arguments to (and
> pointer results from) functions be passed in address registers instead
> of general-purpose registers.
>
> As LLVM internally converts all pointers to integers (in my case i32),
> there is no way for a backend to tell whether an i32 operand is really
> an integer or actually a pointer. Thus neither the instruction selection
> nor the CallingConvention stuff works for me as expected.

Your architecture poses some significant challenges for LLVM. Tackling them sounds possible, though it'll take some work.

I'm working on a patch which changes the way function arguments and return values are lowered:

http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-April/021908.html

(To everyone who gave me feedback, thanks! I'm working on an updated patch.)

The current patch doesn't solve your problem immediately, but one of the suggestions I got was that the argument and return value records should carry information about what the original type of the value was. This is needed, for example, on targets where i64 is not a legal type, but it still needs to be passed differently from two i32 values. I was originally thinking of just including the original MVT type, but it could be changed to the LLVM IR Type*, to provide even more information. I hope to find time to post an updated version of this patch soon, though I don't know if it'll go into LLVM 2.6 or if it'll wait for 2.7.

Beyond the ABI requirements, LLVM treats pointers and integers fairly interchangeably in the optimizer as well as codegen. This isn't specific to LLVM either; there are a lot of cases where integer arithmetic is used to perform an index calculation, so the decision of which instructions to use depends on the context. I've seen other compilers make these decisions around register allocation time with a fair amount of success. This is an area which LLVM hasn't explored much, though I know there are a few people on this list who are working on targets with similar requirements.

> It does not seem possible to solve this problem without modifying at
> least some of the original LLVM source code. So what would be the
> easiest (and least invasive) way to achieve this?

FWIW, everyone I know working on backends that care about quality of generated code ends up needing to do work in target-independent code.

Dan
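As an illustration of the suggestion that argument records carry the original type, here is a minimal, self-contained C++ sketch. It does not use any LLVM API: `ArgRecord`, `assignArgRegs`, the starting registers %a4/%d4, and the limit of four registers per file are all hypothetical choices made for the example. It only shows how a calling-convention routine could route a 32-bit value to an address or data register once it knows whether the value started out as a pointer.

```cpp
#include <string>
#include <vector>

// Hypothetical argument record: every argument has been lowered to a 32-bit
// value, but the record still remembers whether the original IR type was a
// pointer. This flag stands in for "the original type" from the discussion.
struct ArgRecord {
    bool origIsPointer;
};

// Assign each argument to the next free register of the matching file.
// Assumption for illustration only: arguments start at %a4 / %d4 and at most
// four registers per file are used before falling back to the stack.
std::vector<std::string> assignArgRegs(const std::vector<ArgRecord>& args) {
    std::vector<std::string> locations;
    unsigned nextA = 4;
    unsigned nextD = 4;
    for (const ArgRecord& arg : args) {
        unsigned& next = arg.origIsPointer ? nextA : nextD;
        if (next > 7) {
            locations.push_back("stack");
            continue;
        }
        std::string prefix = arg.origIsPointer ? "%a" : "%d";
        locations.push_back(prefix + std::to_string(next));
        ++next;
    }
    return locations;
}
```

Calling `assignArgRegs` on an (integer, pointer, integer) signature keeps two independent counters, so the pointer does not consume a data register slot. Without the `origIsPointer` flag all three arguments would look like plain i32 values and the routine would have no basis for the split.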
Jakob Stoklund Olesen
2009-May-06 17:47 UTC
[LLVMdev] Pointer vs. integer: backend troubles
On 06/05/2009, at 10.58, Christoph Erhardt wrote:

[...]

> Unfortunately, this does not hold true for the TriCore architecture,
> which strictly differentiates between "normal" integer values and
> pointer values. The register set is split into two subsets: 16
> general-purpose registers %d0..%d15 for 32-bit integers and floats, and
> 16 address registers %a0..%a15 for 32-bit pointers, with separate
> instructions.

Hi Christoph,

I am working on a back end for the Blackfin DSP. It also has two register sets: data and pointers, so the architectures are similar.

I am using a lot of register classes to represent the many instruction constraints. The code generator support for weird register classes has improved a lot recently.

The instruction selector does not know about register classes - it only uses value types when pattern matching. That is probably a good idea; it can be hard to tell what is a pointer and what is an integer beforehand.

After instruction selection is complete, each virtual register is assigned to a register class (A or D, say). Currently, the defining instruction determines the register class. Sometimes it is necessary to insert extra copy instructions before instructions that require an incompatible register class. Enabling -join-cross-class-copies will clean up a lot of these copies afterwards.

I have a patch in my own tree that will do register class inference instead: It tries to choose a register class for a virtual register based on all interested instructions, rather than just the defining one. This causes fewer copies to be emitted only to be removed later. The end result is similar, so it is mostly an optimisation.

There are some unsolved problems:

1. Virtual registers from PHI nodes are created before instruction selection, so there is no way of giving them a proper register class. Instead, TargetLowering::getRegClassFor() is used, fingers crossed. It may be that -join-cross-class-copies is able to clean up here too. I have not tested that. Ideally, I would like TargetLowering::getRegClassFor() to go away entirely.

2. There is no way of using alternative instructions with different operand classes. For instance, TriCore can subtract two D registers (SUB), or two A registers (SUB.A). You can only have one pattern for i32 subtraction, so the SUB.A instruction will not be used. We need a way of replacing SUB with SUB.A when it makes sense. Calculating when it makes sense is the hard part.

You can produce correct code without solving these problems - you will just have some extra copies between A and D registers.

> Moreover, the ABI requires that pointer arguments to (and
> pointer results from) functions be passed in address registers instead
> of general-purpose registers.

You should treat this as a separate issue from register allocation. The ABI requirements must be followed exactly, and you need some kind of annotation as Dan described.

For register allocation you can be less exact, and that can be an advantage. Pointer and integer arithmetic is sometimes mixed, and it can be an advantage to keep an integer in a pointer register and vice versa.

HTH
/jakob
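The difference between "the defining instruction decides" and register class inference can be sketched with a small, self-contained C++ toy model. Nothing here is LLVM code or the patch mentioned above: `Constraint`, `classFromDef`, `inferClass`, `copiesNeeded`, and `pickSub` are hypothetical names, and the simple majority vote is an assumed heuristic chosen only to make the idea concrete.

```cpp
#include <string>
#include <vector>

enum class RegClass { D, A };

// One constraint that an instruction places on a virtual register.
struct Constraint {
    RegClass wants;  // register class this instruction needs the value in
    bool isDef;      // true for the defining instruction, false for a use
};

// Behaviour described in the thread: the defining instruction decides.
RegClass classFromDef(const std::vector<Constraint>& cs) {
    for (const Constraint& c : cs)
        if (c.isDef) return c.wants;
    return RegClass::D;  // assumed fallback when no definition is listed
}

// Assumed inference heuristic: majority vote over all interested
// instructions, with ties resolved in favour of the defining instruction.
RegClass inferClass(const std::vector<Constraint>& cs) {
    int wantA = 0;
    int wantD = 0;
    for (const Constraint& c : cs)
        (c.wants == RegClass::A ? wantA : wantD)++;
    if (wantA == wantD) return classFromDef(cs);
    return wantA > wantD ? RegClass::A : RegClass::D;
}

// Number of cross-class copies needed if the register lives in `chosen`:
// one for every instruction that wanted the other class.
int copiesNeeded(const std::vector<Constraint>& cs, RegClass chosen) {
    int copies = 0;
    for (const Constraint& c : cs)
        if (c.wants != chosen) ++copies;
    return copies;
}

// Toy version of "replace SUB with SUB.A when it makes sense": use SUB.A
// only when both operands already live in address registers.
std::string pickSub(RegClass lhs, RegClass rhs) {
    return (lhs == RegClass::A && rhs == RegClass::A) ? "SUB.A" : "SUB";
}
```

For a virtual register defined by a D-class instruction and used by two A-class instructions, `classFromDef` picks D and `copiesNeeded` reports two copies, while `inferClass` picks A and leaves only one. A real implementation would also have to weigh execution frequency and copy cost, which this sketch ignores.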