thr3ads.net - llvm dev - [LLVMdev] FP emulation [Oct 2006]


Chris Lattner

2006-Oct-08 23:58 UTC

[LLVMdev] tblgen multiclasses

For anyone interested, X86InstrSSE.td now makes extensive use of
multiclasses, if people are looking for examples other than the SPARC
backend.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Roman Levenstein

2006-Oct-09 08:33 UTC

[LLVMdev] tblgen multiclasses

Hi Chris,

Thanks for this info. This provides even better and more advanced
examples of multiclass usage! 

But your previous explanations were so good that last week I had
already implemented in my backend almost the same thing you have now
done in X86InstrSSE.td. I even introduced an isCommutable parameter to
indicate this property, just as you did. So integer arithmetic and
general-purpose instructions are now implemented, and I'm working on FP
support.

Some feedback about tblgen from my side, i.e. from an LLVM newcomer with
quite some compiler construction experience:
When it comes to tblgen descriptions and the corresponding DAG selection
and lowering code to be written for a backend, I found the use of
InFlag, OutFlag and chains hard to understand, very underspecified and
not well documented. Even though they are used in all backends, their
semantics and correct use are far from obvious. I spent most of my time
getting these things right, and I learned that if they misbehave, the
consequences for the overall code selection process are fatal. Now it
works, but I still don't quite understand their overall semantics and
don't feel very confident about them. And I guess I'm not the only one.
Therefore, I would kindly ask for a clear explanation of these concepts
here on the developers list, and maybe even in the docs, with guidelines
on their usage and a small but understandable example that uses them. I
think such a description would make the creation of new backends much
easier and faster.

Best Regards,
 Roman


Roman Levenstein

2006-Oct-09 22:18 UTC

[LLVMdev] FP emulation

Hi,

I'm now ready to implement the FP support for my embedded target. 

My target supports only f64 at the moment.
Question: How can I tell LLVM that float is the same as double on my
target? Maybe by assigning the same register class to both MVT::f32
and MVT::f64?

But FP is supported only in emulated mode, because the target does
not have any hardware support for FP. Therefore each FP operation is
supposed to be converted into a call to an assembler function
implementing the corresponding operation. All these FP routines expect
their parameters in fixed f64 registers, i.e. %d0 and %d1, and return
their results in %d0. The value of %d1 is clobbered by such calls.
(Actually the %dX are pseudo registers, see below.)

1. Since these FP emulation functions take operands in registers and
produce results in registers without any further side effects, they
look pretty much like real instructions. Thus I have the idea of
representing them in the tblgen instruction descriptions as
pseudo-instructions, where constraints define which concrete physical
%dX registers are to be used. This would enforce correct register
allocation.

For example:
def FSUB64: I<0x11, (ops), "fsub64", [(set d0, (fsub d0, d1))]>,
           Imp<[d0,d1],[d0,d1]>; // Uses d0, d1 and defines d0,d1 

This seems to work, at least on simple test files. 
            
But I would also need a way to convert such an FSUB64 pseudo-instruction
into the assembler function call, e.g. "call __fsub64". At the moment I
don't quite understand at which stage (lowering, selection, combining?)
and how I should do it. What would be the easiest way to map it to
such a call instruction?
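
To make the intended expansion concrete, here is a minimal sketch on a toy
model. It does not use any LLVM API: the FPOp enum and both functions are
hypothetical names invented for illustration, and instructions are plain
strings. Only the routine name "__fsub64" and the %d0/%d1 convention come
from the message above.

```cpp
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class FPOp { Sub };

// Name of the emulation routine for an operation. "__fsub64" is the
// name used in the message above.
inline std::string libcallName(FPOp op) {
    switch (op) {
    case FPOp::Sub: return "__fsub64";
    }
    return "";
}

// Expand "dst = op lhs, rhs" into copies to the fixed argument registers,
// the call itself, and a copy of the result out of %d0.
// Operand order is "mov src, dst", matching the listing in the message.
inline std::vector<std::string> lowerToLibcall(FPOp op,
                                               const std::string &lhs,
                                               const std::string &rhs,
                                               const std::string &dst) {
    return {
        "mov " + lhs + ", %d0",
        "mov " + rhs + ", %d1",
        "call " + libcallName(op),  // clobbers %d1, result in %d0
        "mov %d0, " + dst,
    };
}
```

In a real backend this expansion would happen on DAG nodes or machine
instructions rather than strings; the sketch only shows the shape of the
sequence that has to be produced.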

One issue with the described approach is that the code resulting after
register allocation is pretty inefficient. For example, there are a lot
of instructions of the form "mov %d0, %d0", copying a register into
itself. My guess is that the following happens:
 before reg.alloc there are instructions of the form:
 mov %virtual_reg0, %d0 
 mov %virtual_reg1, %d1 
 fsub64
which ensure that the operand constraints of the operation are fulfilled
and the operands are in the right registers. During allocation, the
register allocator assigns to each virtual register the same physical
register it is copied into. Therefore the code becomes:
 mov %d0, %d0 
 mov %d1, %d1 
 fsub64

But then no "useless copy elimination" or peephole pass runs that
would remove such copies.

Question: Is there such a pass available in LLVM? It would also be
interesting to know why regalloc does not eliminate such coalesced
moves itself. Wouldn't that make sense?
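
The kind of cleanup being asked about can be sketched as follows. This is a
toy model, not an LLVM pass: the Instr struct and both function names are
hypothetical, and a real implementation would work on the backend's own
machine instruction representation.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: not LLVM's MachineInstr, just enough to show the idea.
struct Instr {
    std::string opcode;
    std::string src;
    std::string dst;
};

// A copy whose source and destination are the same register is a no-op.
inline bool isIdentityCopy(const Instr &i) {
    return i.opcode == "mov" && i.src == i.dst;
}

// Remove identity copies in place and return how many were removed.
inline std::size_t removeIdentityCopies(std::vector<Instr> &code) {
    const std::size_t before = code.size();
    code.erase(std::remove_if(code.begin(), code.end(), isIdentityCopy),
               code.end());
    return before - code.size();
}
```

Applied to the three-instruction sequence above, this would leave only the
fsub64 instruction.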

Does this idea of representing the emulated FP operation calls as
instructions as described above make some sense? Or do you see easier
or more useful ways to do it?

2. In reality, the processor has only 32-bit regs. Therefore, any f64
value should be mapped to two 32-bit registers. What is the best way to
achieve this? I guess this is a well-known kind of problem.

So far I have been thinking about introducing some pseudo f64 registers,
i.e. the %dX used above, and working with them in the instruction
descriptions. Then at a later stage, probably after lowering and
selection, they would be expanded into pairs of load or store operations.

But I'm not quite sure that this is the right way to go. I suspect that
something can be done using some form of EXPAND operation in the
lowering pass. For example, I see that an assignment of an f64 immediate
to a global is expanded by LLVM automatically into two 32-bit stores,
which is very nice. Maybe it is possible to do the same for 64-bit
registers as well?
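
For illustration, splitting an f64 into two 32-bit halves and joining them
again can be sketched like this. It is a standalone example, not LLVM code:
the Halves struct and the function names are hypothetical, and it assumes
that double is a 64-bit IEEE 754 value. Which half goes into which register
or memory word is a target decision that the sketch does not address.

```cpp
#include <cstdint>
#include <cstring>

// The two 32-bit halves of a 64-bit bit pattern (hypothetical helper type).
struct Halves {
    std::uint32_t lo;  // bits 0..31
    std::uint32_t hi;  // bits 32..63
};

// Reinterpret the double as 64 raw bits and split them.
inline Halves splitF64(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return {static_cast<std::uint32_t>(bits),
            static_cast<std::uint32_t>(bits >> 32)};
}

// Inverse of splitF64: rebuild the double from its two halves.
inline double joinF64(Halves h) {
    const std::uint64_t bits =
        (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

The two halves are what would end up in a pair of 32-bit registers or in
two 32-bit stores.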

OK, enough questions for today ;)

Thanks for any feedback, 
 Roman



Chris Lattner

2006-Oct-09 23:00 UTC

[LLVMdev] tblgen multiclasses

On Mon, 9 Oct 2006, Roman Levenstein wrote:
> But your previous explanations were so good that I implemented in my
> backend last week almost the same that you've done now in the
> X86InstrSSE.td. I even introduced isCommutable parameter to indicate
> this property, just as you did. So, by now integer arithmetic and
> general purpose instructions are implemented. I'm working on the FP
> support now.

Great :)

> Some feedback about tblgen from my side, i.e. an LLVM newcomer with
> quite some compiler construction experience:
> When it comes to tblgen descriptions and corresponding DAG selection
> and lowering code to be written for a backend, I found the use of
> InFlag, OutFlag and chains less understandable, very underspecified and
> not (well) documented. Even though they are used in all backends,
> their semantics and correct use is far from obvious (even though I'm
> not new to compiler writing).

Right, it is unfortunate that the code generator isn't better documented
:(.  Patches graciously accepted :)

> I spent most time on getting these things right. And I learned that if 
> they misbehave it has very fatal consequences on the overall code 
> selection process. Now it works, but I still don't quite understand 
> their overall semantics and don't feel very confident about them. And I
> guess I'm not the only one. Therefore, I would kindly ask to provide 
> here on the developers list and may be even in the docs a clear 
> explanation of these concepts, giving guidelines on their usage and 
> probably a small, but understandable example making use of them. I think 
> such a description would make creation of new backends much easier and 
> faster.

Basically, flag operands are a hack used to handle resources that are not
accurately modeled in the scheduler (e.g. condition codes, explicit
register assignments, etc).  The basic idea of a flag operand is that
it requires the scheduler to keep the "flagged" nodes stuck together in
the output machine instructions.
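
The "stuck together" constraint can be pictured with a small toy check. This
is not how the LLVM scheduler is implemented: the FlagEdge alias, the
function name and the node names are all hypothetical, and a schedule is
just a list of names. A flag edge here means that the second node must be
emitted immediately after the first.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model: (producer, consumer) pair that must stay adjacent.
using FlagEdge = std::pair<std::string, std::string>;

// Returns true if every flagged pair appears back to back in the schedule.
inline bool respectsFlags(const std::vector<std::string> &schedule,
                          const std::vector<FlagEdge> &flags) {
    for (const FlagEdge &edge : flags) {
        bool adjacent = false;
        for (std::size_t i = 0; i + 1 < schedule.size(); ++i) {
            if (schedule[i] == edge.first && schedule[i + 1] == edge.second) {
                adjacent = true;
                break;
            }
        }
        if (!adjacent) return false;
    }
    return true;
}
```

For example, with a flag edge from a compare to the branch that reads its
condition codes, a schedule that places another instruction between the two
would be rejected, because that instruction might overwrite the condition
codes.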

If you have a specific question, I'm more than happy to answer it,

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
