thr3ads.net - llvm dev - [LLVMdev] Pointer vs. integer: backend troubles [May 2009]

Christoph Erhardt

2009-May-06 08:58 UTC

[LLVMdev] Pointer vs. integer: backend troubles

Hi everyone,

I am currently working on a backend for the TriCore architecture.
Unfortunately, I have hit an issue with LLVM's internal representation
that's giving me a bit of a headache.

The problem is that LLVM assumes that a pointer is equivalent to a
machine-word sized integer. This implies that all pointer arithmetic
takes place in the CPU's general-purpose registers and is done with the
"regular" integer instructions.
Unfortunately, this does not hold true for the TriCore architecture,
which strictly differentiates between "normal" integer values and
pointer values. The register set is split into two subsets: 16
general-purpose registers %d0..%d15 for 32-bit integers and floats, and
16 address registers %a0..%a15 for 32-bit pointers, with separate
instructions. Moreover, the ABI requires that pointer arguments to (and
pointer results from) functions be passed in address registers instead
of general-purpose registers.
As LLVM internally converts all pointers to integers (in my case i32),
there is no way for a backend to tell whether an i32 operand is really
an integer or actually a pointer. Thus neither the instruction selection
nor the CallingConvention stuff works for me as expected.

It does not seem possible to solve this problem without modifying at
least some of the original LLVM source code. So what would be the
easiest (and least invasive) way to achieve this?
I have thought about adding a new ValueType (say, "p32") and overriding
TargetLowering::getPointerTy() to return that new type instead of i32.
Of course, this would probably be more of a dirty hack than an actual
solution, but hopefully would do the trick - provided I'm not missing
something...

Comments and suggestions are highly welcome.
Thank you for your time!

Christoph

ether zhhb

2009-May-06 12:25 UTC

What about adding an annotation to each i32 value that represents a
pointer, or annotating the "i32" type of that pointer?

Dan Gohman

2009-May-06 16:18 UTC

On May 6, 2009, at 1:58 AM, Christoph Erhardt wrote:
> Hi everyone,
>
> I am currently working on a backend for the TriCore architecture.
> Unfortunately, I have hit an issue with LLVM's internal representation
> that's giving me a bit of a headache.
>
> The problem is that LLVM assumes that a pointer is equivalent to a
> machine-word sized integer. This implies that all pointer arithmetic
> takes place in the CPU's general-purpose registers and is done with the
> "regular" integer instructions.
> Unfortunately, this does not hold true for the TriCore architecture,
> which strictly differentiates between "normal" integer values and
> pointer values. The register set is split into two subsets: 16
> general-purpose registers %d0..%d15 for 32-bit integers and floats, and
> 16 address registers %a0..%a15 for 32-bit pointers, with separate
> instructions. Moreover, the ABI requires that pointer arguments to (and
> pointer results from) functions be passed in address registers instead
> of general-purpose registers.
>
> As LLVM internally converts all pointers to integers (in my case i32),
> there is no way for a backend to tell whether an i32 operand is really
> an integer or actually a pointer. Thus neither the instruction selection
> nor the CallingConvention stuff works for me as expected.

Your architecture poses some significant challenges for LLVM. Tackling
them sounds possible, though it'll take some work.

I'm working on a patch which changes the way function arguments and
return values are lowered:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-April/021908.html
(To everyone who gave me feedback, thanks! I'm working on an updated
patch.)

The current patch doesn't solve your problem immediately, but one of the
suggestions I got was that the argument and return value records should
carry information about what the original type of the value was. This is
needed, for example, on targets where i64 is not a legal type but still
needs to be passed differently from two i32 values. I was originally
thinking of just including the original MVT type, but it could be changed
to the LLVM IR Type*, to provide even more information. I hope to find
time to post an updated version of this patch soon, though I don't
know if it'll go into LLVM 2.6 or if it'll wait for 2.7.
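
To make the idea concrete, here is a minimal standalone C++ sketch of
argument assignment when each argument record remembers whether its
original type was a pointer. It does not use any LLVM API: ArgInfo,
assignArgs and the register names passed in are hypothetical, and the
register lists are illustrative rather than taken from the TriCore ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical argument record: the value is a legal i32 by now, but the
// record still remembers whether the original IR type was a pointer.
struct ArgInfo {
    bool origIsPointer;
};

// Assigns each argument to the next free register of the matching register
// file, falling back to "stack" once that file is exhausted.
std::vector<std::string> assignArgs(const std::vector<ArgInfo>& args,
                                    const std::vector<std::string>& dataRegs,
                                    const std::vector<std::string>& addrRegs) {
    std::vector<std::string> locs;
    std::size_t nextData = 0;
    std::size_t nextAddr = 0;
    for (const ArgInfo& arg : args) {
        const std::vector<std::string>& pool =
            arg.origIsPointer ? addrRegs : dataRegs;
        std::size_t& next = arg.origIsPointer ? nextAddr : nextData;
        if (next < pool.size()) {
            locs.push_back(pool[next]);
            ++next;
        } else {
            locs.push_back(std::string("stack"));
        }
    }
    // Every argument received exactly one location.
    assert(locs.size() == args.size());
    return locs;
}
```

The two counters advance independently, so an integer argument never
consumes an address register and vice versa. Without the origIsPointer
flag, all arguments would look like plain i32 values and this split
could not be made.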

Beyond the ABI requirements, LLVM treats pointers and integers fairly
interchangeably in the optimizer as well as codegen. This isn't
specific to LLVM either; there are a lot of cases where integer
arithmetic is used to perform an index calculation, so the decision of
which instructions to use depends on the context. I've seen other
compilers make these decisions around register allocation time with
a fair amount of success. This is an area which LLVM hasn't
explored much, though I know there are a few people on this list
who are working on targets with similar requirements.
>
>
> It does not seem possible to solve this problem without modifying at
> least some of the original LLVM source code. So what would be the
> easiest (and least invasive) way to achieve this?
FWIW, everyone I know working on backends that care about quality of
generated code ends up needing to do work in target-independent code.

Dan

Jakob Stoklund Olesen

2009-May-06 17:47 UTC

On 06/05/2009, at 10.58, Christoph Erhardt wrote:
[...]
> Unfortunately, this does not hold true for the TriCore architecture,
> which strictly differentiates between "normal" integer values and
> pointer values. The register set is split into two subsets: 16
> general-purpose registers %d0..%d15 for 32-bit integers and floats, and
> 16 address registers %a0..%a15 for 32-bit pointers, with separate
> instructions.

Hi Christoph,

I am working on a back end for the Blackfin DSP. It also has two  
register sets: data and pointers, so the architectures are similar. I  
am using a lot of register classes to represent the many instruction  
constraints. The code generator support for weird register classes has  
improved a lot recently.

The instruction selector does not know about register classes - it  
only uses value types when pattern matching. That is probably a good  
idea; it can be hard to tell what is a pointer and what is an integer  
beforehand.

After instruction selection is complete, each virtual register is  
assigned to a register class (A or D, say). Currently, the defining  
instruction determines the register class. Sometimes it is necessary  
to insert extra copy instructions before instructions that require an  
incompatible register class. Enabling -join-cross-class-copies will  
clean up a lot of these copies afterwards.

I have a patch in my own tree that will do register class inference
instead: It tries to choose a register class for a virtual register
based on all instructions that define or use it, rather than just the
defining one. This causes fewer copies to be emitted only to be removed
later. The end result is similar, so it is mostly an optimisation.
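
One simple way to picture such an inference is a majority vote over the
register classes that the defining and using instructions require. The
standalone C++ sketch below is an assumption made for illustration, not
the patch described above; RegClass, Constraint, inferRegClass and
copiesNeeded are hypothetical names, and no LLVM API is involved.

```cpp
#include <cassert>
#include <vector>

enum class RegClass { Data, Addr };

// The register class that one instruction defining or using the virtual
// register requires. By convention here, entry 0 is the defining one.
struct Constraint {
    RegClass required;
};

// Picks the class wanted by most instructions. Ties go to the class of
// the defining instruction, which matches the default behaviour.
RegClass inferRegClass(const std::vector<Constraint>& constraints) {
    assert(!constraints.empty() && "a virtual register has at least a def");
    int data = 0;
    int addr = 0;
    for (const Constraint& c : constraints) {
        if (c.required == RegClass::Data) {
            ++data;
        } else {
            ++addr;
        }
    }
    if (data == addr) {
        return constraints.front().required;
    }
    return data > addr ? RegClass::Data : RegClass::Addr;
}

// Counts the cross-class copies needed if the register lives in `chosen`:
// one per instruction that requires the other class.
int copiesNeeded(const std::vector<Constraint>& constraints, RegClass chosen) {
    int copies = 0;
    for (const Constraint& c : constraints) {
        if (c.required != chosen) {
            ++copies;
        }
    }
    return copies;
}
```

For a register defined by a D-class instruction and used by two A-class
instructions, the vote picks A and needs one copy, whereas following
only the defining instruction would need two.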

There are some unsolved problems:

1. Virtual registers from PHI nodes are created before instruction  
selection, so there is no way of giving them a proper register class.  
Instead, TargetLowering::getRegClassFor() is used, fingers crossed. It  
may be that -join-cross-class-copies is able to clean up here too. I  
have not tested that. Ideally, I would like  
TargetLowering::getRegClassFor() to go away entirely.

2. There is no way of using alternative instructions with different  
operand classes. For instance, TriCore can subtract two D registers  
(SUB), or two A registers (SUB.A). You can only have one pattern for  
i32 subtraction, so the SUB.A instruction will not be used. We need a  
way of replacing SUB with SUB.A when it makes sense. Calculating when  
it makes sense is the hard part.

You can produce correct code without solving these problems - you will  
just have some extra copies between A and D registers.
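
As a rough illustration of the second problem, the choice between SUB
and SUB.A can be framed as counting the cross-class copies each form
would need. The C++ sketch below is a simplified local heuristic under
that assumption; SubContext, copiesFor and chooseSub are hypothetical
names, and a real solution would have to look beyond one instruction.

```cpp
#include <cassert>
#include <string>

enum class RegClass { Data, Addr };

// Register classes in which the two operands already live, and the class
// in which the users of the result want it.
struct SubContext {
    RegClass lhs;
    RegClass rhs;
    RegClass result;
};

// Counts the cross-class copies needed if the whole subtraction is done
// in register class `rc`.
int copiesFor(const SubContext& ctx, RegClass rc) {
    return (ctx.lhs != rc) + (ctx.rhs != rc) + (ctx.result != rc);
}

// Returns "SUB.A" only when the address-register form needs strictly
// fewer copies; otherwise keeps the default "SUB".
std::string chooseSub(const SubContext& ctx) {
    const int dataCost = copiesFor(ctx, RegClass::Data);
    const int addrCost = copiesFor(ctx, RegClass::Addr);
    // Each of the three values mismatches exactly one of the two classes.
    assert(dataCost + addrCost == 3);
    return addrCost < dataCost ? "SUB.A" : "SUB";
}
```

With both operands and the result in A registers the sketch picks SUB.A
at zero copies, while SUB would need three. It ignores how the operands
came to be in their classes, which is part of what makes the real
decision hard.
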
> Moreover, the ABI requires that pointer arguments to (and
> pointer results from) functions be passed in address registers instead
> of general-purpose registers.
You should treat this as a separate issue from register allocation.  
The ABI requirements must be followed exactly, and you need some kind  
of annotation as Dan described. For register allocation you can be  
less exact, and that can be an advantage. Pointer and integer  
arithmetic is sometimes mixed, and it can be an advantage to keep an  
integer in a pointer register and vice versa.

HTH
/jakob
