thr3ads.net - llvm dev - [LLVMdev] Backend: 2 address + 17bit immediate [Mar 2007]

Andy Nisbet

2007-Mar-22 14:38 UTC

[LLVMdev] Backend: 2 address + 17bit immediate

Hello,
I'm (trying) to write a backend for a simple 32-bit processor architecture
with a single instruction format and no condition code registers.
www.docm.mmu.ac.uk/STAFF/A.Nisbet/Sabre.pdf is the short 15-page document
describing the architecture of Sabre. It is a Celoxica-developed
research/teaching processor; pages 5-8 contain the information relevant to
targeting it from a new compiler backend, i.e. it is trivially simple, with
25 actual instructions. There is a typo on page 5: operand A is clearly bits 9-5.

The general form for instructions is:

opcode %a, %b, 17bit signed immediate.

%b is a source register.
%a is typically both the source and the destination register for the operation,
i.e. %a = operation %a, %b, immediate.
%b and the immediate act like a virtual operand c that is the sum
of register b's contents and the immediate value.
%b can be omitted if it refers to the "zero valued register %0".
The immediate can be omitted if it has a zero value.
The exceptions to this are the various forms of conditional branch
instruction, which must compare the contents of 2 registers and specify a
branch target address using the immediate (textually the immediate is a
label; in machine code it is an offset relative to the PC).
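
A minimal C++ sketch of the operand model described above, assuming the 17-bit immediate is two's complement (range -65536 to 65535) and that arithmetic wraps modulo 2^32. Neither assumption is confirmed here against the Sabre document, and the names `fitsSImm17`, `operandC` and `execAdd` are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the 17-bit immediate is two's complement, so it spans
// -65536 .. 65535. All names here are hypothetical illustrations.
constexpr bool fitsSImm17(int64_t v) {
  return v >= -(int64_t{1} << 16) && v < (int64_t{1} << 16);
}

// The "virtual operand c": contents of register b plus the immediate.
// Unsigned arithmetic keeps overflow well defined (wraps modulo 2^32).
constexpr uint32_t operandC(uint32_t regB, int32_t imm) {
  return regB + static_cast<uint32_t>(imm);
}

// Two-address ADD: %a = %a + (%b + imm).
// Omitting %b corresponds to passing 0 (the zero register's value);
// omitting the immediate corresponds to passing imm == 0.
inline uint32_t execAdd(uint32_t regA, uint32_t regB, int32_t imm) {
  assert(fitsSImm17(imm) && "immediate does not fit in 17 signed bits");
  return regA + operandC(regB, imm);
}

static_assert(fitsSImm17(65535) && fitsSImm17(-65536), "range endpoints");
static_assert(!fitsSImm17(65536) && !fitsSImm17(-65537), "just outside");
```

Under these assumptions, any unsigned 16-bit value (0 to 65535) also fits in the immediate field, which matters for the HI16/LO16 split discussed later in the thread.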


I have spent some time looking at the PPC and SPARC backends, but obviously
these are much more complicated than what I need to implement.
Consequently, I am not correctly grasping the interactions between
ARCHInstrInfo.td and ARCHDAGToDAGISel.cpp. I did manage to hack something
together based on a copy of SPARC (with a SABRE namespace etc.), but the
instruction selection was incorrect and I obtained a "Cannot yet
select:0x..." assertion failure from SABREDAGToDAGIsel::SelectCode when I
attempted:
llc -march sabre helloworld.bc -o helloworld.s

Can anyone offer any guidance on how to proceed with debugging instruction 
selection issues? Or perhaps some description of how the pattern matching 
and the instruction selection works with a verbose explanation for a single 
instruction (this would probably be more beneficial), relating the 
Processor instruction set to the LLVM supported instruction set and the 
actual code generation/printing.


WRT defining the instructions themselves: am I right in thinking that it is
sensible (for instruction selection) to represent the instruction set as a
collection of instructions targeting register-register and
register-immediate forms, so that, for example, I would create defs for
ADDrr to match ADD %a,%b
ADDri to match ADD %a, immediate
I have used multiclass to achieve this. Previously I was attempting to
match the opcode %a,%b,immediate general form.

Clearly I also need a way to load a 32-bit constant value into a register
in order to be able to address more than 64K of memory. I know the PPC
does something similar ...

So, for example, for SABRE this instruction sequence would do what is
necessary ...
MOVri %a, HI16(32 bit constant)
LSHri %a,16
ORri %a, LO16(same 32 bit constant)
LD %d, %a // i.e. load the contents of the memory at the address stored in %a into register %d

where the HI16/LO16 computations are performed at code generation time by
LLVM. I'm a little confused as to how to specify this as a pattern in
tablegen syntax, even with the PPC example.

Apologies for the naivety of these questions.

Thanks,
         Andy



      Dr. Andy Nisbet: URL http://www.docm.mmu.ac.uk/STAFF/A.Nisbet
Department of Computing and Mathematics, John Dalton Building, Manchester
        Metropolitan University, Chester Street, Manchester M1 5GD, UK.
Email: A.Nisbet at mmu.ac.uk, Phone:(+44)-161-247-1556; Fax:(+44)-161-247-1483.

Christopher Lamb

2007-Mar-22 17:32 UTC

[LLVMdev] Backend: 2 address + 17bit immediate

Hi Andy,

I've been working through a backend for the first time over the last  
several weeks, so I thought I'd share what insights I have into the  
subjects you mention.

On Mar 22, 2007, at 9:38 AM, Andy Nisbet wrote:
> I have spent some time looking at the PPC and SPARC backends, but obviously
> these are much more complicated than what I require to implement.
> Consequently, I am not correctly grasping the interactions between
> ARCHInstrInfo.td and ARCHDAGToDAGISel.cpp I did manage to hack something
> together based on a copy of SPARC (with a SABRE namespace etc) but the
> instruction selection was incorrect and I obtained a "Cannot yet
> select:0x..." assertion failure from SABREDAGToDAGIsel::SelectCode when I
> attempted a
> llc -march sabre helloworld.bc -o helloworld.s
>
> Can anyone offer any guidance on how to proceed with debugging instruction
> selection issues?
The *InstrInfo.td file is used to generate a file called
*GenDAGISel.inc in the build directory, under the lib/Target directory
you're working on. This is a C++ file that is included into the DAG
instruction selector's .cpp file, and it contains the instruction selection
rules specified in *InstrInfo.td.

When instruction selection is performed, control flow first enters the
Select() method of your instruction selector object (its class is usually
named something like *DAGToDAGISel). If that method doesn't select the
DAG node, it calls another method, SelectCode(), which calls into the
tblgen-generated instruction selection code in the .inc file.

The specific assert you mention is in the tblgen-generated .inc file; it
fires when no pattern matches your DAG. My suggestion
would be to step through the Select() function and then into the .inc
file to see where your instruction selection may be going awry.
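
To make that control flow concrete, here is a toy C++ model of the two-stage dispatch. It is only a sketch: `ToyNode`, `ToySelector`, the opcode strings and the two-entry pattern table are all hypothetical and are not LLVM's types or API. It just mirrors the order described above, with the hand-written Select() first and the generated matcher second, failing loudly when nothing matches.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Toy model only: none of these types are LLVM's.
struct ToyNode {
  std::string opcode;  // e.g. "add", "load", "frameindex"
};

class ToySelector {
public:
  // Stand-in for the hand-written Select(): handles special cases in C++
  // and defers everything else to the table-driven matcher.
  std::string Select(const ToyNode &n) const {
    assert(!n.opcode.empty() && "node must have an opcode");
    if (n.opcode == "frameindex")
      return "ADDri";  // custom selection, bypasses the patterns
    return SelectCode(n);
  }

private:
  // Stand-in for the generated SelectCode(): a lookup over the patterns
  // that were written in the .td file.
  std::string SelectCode(const ToyNode &n) const {
    static const std::map<std::string, std::string> patterns = {
        {"add", "ADDrr"}, {"or", "ORrr"}};
    auto it = patterns.find(n.opcode);
    if (it == patterns.end())
      throw std::runtime_error("Cannot yet select: " + n.opcode);
    return it->second;
  }
};
```

In this model a "load" node has neither a custom case nor a pattern, so it reaches the failure path. That is the situation to look for when stepping through the real generated code: which node arrives there, and which pattern you expected to cover it.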

> WRT defining the instructions themselves: am I right in thinking that it is
> sensible (for instruction selection) to represent the instruction set as a
> collection of instructions targetting register register and register
> immediate, so for example I would create defs for
> ADDrr to match ADD %a,%b
> ADDri to match ADD %a, immediate
> I have used multiclass to achieve this. Previously I was attempting to
> match the opcode %a,%b,immediate general form.
This seems the sensible way to proceed, to me.
> Clearly I also need a way to load a 32 bit constant value into a register
> in order to be able to address more than 64K of memory. I know the PPC
> does something similar ...
>
> So for example for SABRE this instruction output would perform the
> necessary ...
> MOVri %a, HI16(32 bit constant)
> LSHri %a,16
> ORri %a, LO16(same 32 bit constant)
> LD %d, %a // ie load the contents of the memory at the address stored in %a
> into register %d
>
> where the HI/LO16 are performed at code generation by LLVM. I'm a little
> confused as to how to specify this as a pattern in tablegen syntax, even
> with the PPC example.
Generating code for immediates like this, when they are global variable
addresses or constant pool addresses, involves an interaction between
instruction selection and the target lowering implementation.
Generating numeric immediates only requires some patterns in the
instruction selector. The key is that global addresses and
numeric immediates follow two separate but similar paths through
code generation.

Numeric Immediates:

You'll need SDNodeXForm definitions in your InstrInfo.td that implement the
LO16/HI16 parts, like:

def LO16 : SDNodeXForm<imm, [{
  // Transformation function: get the low 16 bits.
  return CurDAG->getTargetConstant((unsigned)N->getValue() & 0xFFFF,
                                   MVT::i32);
}]>;

With LO16 and a corresponding HI16 defined, a numeric immediate is simply a pattern like:

def : Pat<(i32 imm:$imm),
          (ORri (LSHri (MOVri RZero, (HI16 imm:$imm)), 16), (LO16 imm:$imm))>;
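
As a sanity check on the arithmetic behind that pattern, the following standalone C++ sketch emulates the MOVri / LSHri / ORri sequence on one register. The helper names `lo16`, `hi16` and `materialize` are hypothetical, and the instruction semantics (MOV loads the immediate, LSH shifts left, OR is a bitwise or) are assumed from the mnemonics rather than taken from the Sabre document.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers mirroring the LO16/HI16 transforms.
constexpr uint32_t lo16(uint32_t v) { return v & 0xFFFFu; }
constexpr uint32_t hi16(uint32_t v) { return (v >> 16) & 0xFFFFu; }

// Emulates the three-instruction sequence on one destination register.
inline uint32_t materialize(uint32_t value) {
  uint32_t a = hi16(value);  // MOVri %a, HI16(value)
  a <<= 16;                  // LSHri %a, 16
  a |= lo16(value);          // ORri  %a, LO16(value)
  return a;
}

// Both halves are at most 0xFFFF, so each fits in a 17-bit signed
// immediate, whose largest positive value is 65535.
static_assert(hi16(0xFFFFFFFFu) <= 0xFFFFu && lo16(0xFFFFFFFFu) <= 0xFFFFu,
              "halves must fit in the immediate field");
```

Because the low half is zero-extended and combined with OR, the high half needs no adjustment. If the low half were instead added as a sign-extended value, the high half would have to be corrected by one whenever bit 15 is set, which is what PPC's HA16 does.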

Global/Constant Pool Addresses:

I don't completely understand why it's not possible to simply
instruction select these as one does with integer immediates, but all
the targets I've looked at follow a similar approach that uses
custom lowering of these values into two target-specific DAG
nodes. If you look at the Sparc target lowering call (LowerOperation())
for ISD::GlobalAddress you'll see how it gets split into a DAG
that includes some target-specific nodes (SPISD::Hi/Lo).

Then, in the InstrInfo.td, look for the SDNode declarations for
these target-specific nodes and then the selection patterns that
match them. It's pretty similar to the pattern for integer immediates.

Hope this helps.
--
Christopher Lamb
christopher.lamb at gmail.com
