For anyone interested, X86InstrSSE.td now makes extensive use of multiclasses, if people are looking for examples other than the Sparc backend.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/
Hi Chris,

Thanks for this info. This provides even better and more advanced examples of multiclass usage! But your previous explanations were so good that last week I implemented in my backend almost the same thing you have now done in X86InstrSSE.td. I even introduced an isCommutable parameter to indicate this property, just as you did. So by now, integer arithmetic and general-purpose instructions are implemented. I'm working on the FP support now.

Some feedback about tblgen from my side, i.e. from an LLVM newcomer with quite some compiler construction experience:

When it comes to tblgen descriptions and the corresponding DAG selection and lowering code that has to be written for a backend, I found the use of InFlag, OutFlag and chains hard to understand, underspecified and not well documented. Even though they are used in all backends, their semantics and correct use are far from obvious (even though I'm not new to compiler writing). I spent most of my time on getting these things right, and I learned that if they misbehave, the consequences for the overall code selection process are fatal. Now it works, but I still don't quite understand their overall semantics and don't feel very confident about them. And I guess I'm not the only one.

Therefore, I would kindly ask for a clear explanation of these concepts, here on the developers list and maybe even in the docs, with guidelines on their usage and ideally a small but understandable example that makes use of them. I think such a description would make the creation of new backends much easier and faster.

Best Regards,
Roman

--- Chris Lattner <sabre at nondot.org> wrote:
> For anyone interested, X86InstrSSE.td makes extensive use of multiclasses
> now if people are looking for examples other than the sparc backend.
>
> -Chris
Hi,

I'm now ready to implement the FP support for my embedded target.

My target supports only f64 at the moment. Question: How can I tell LLVM that float is the same as double on my target? Maybe by assigning the same register class to both MVT::f32 and MVT::f64?

FP is supported only in emulated mode, because the target has no hardware support for FP. Therefore each FP operation is supposed to be converted into a call to an assembler function implementing the corresponding operation. All these FP operations implemented in assembler expect their parameters in fixed f64 registers, i.e. %d0 and %d1, and return their results in %d0. The value of %d1 is clobbered by such calls. (Actually, the %dX are pseudo registers, see below.)

1. Since these FP emulation functions take their operands in registers and produce their results in registers without any further side effects, they look pretty much like real instructions. Thus I have the idea of representing them in the tblgen instruction descriptions as pseudo-instructions, where constraints define which physical %dX registers must be used. This would enforce correct register allocation. For example:

    def FSUB64: I<0x11, (ops), "fsub64", [(set d0, (fsub d0, d1))]>,
                Imp<[d0,d1],[d0,d1]>; // Uses d0, d1 and defines d0, d1

This seems to work, at least on simple test files.

But I would also need a way to convert such an FSUB64 pseudo-instruction into the assembler function call, e.g. "call __fsub64". At the moment I don't quite understand at which stage (lowering, selection, combining?) and how I should do it. What would be the easiest way to map it to such a call instruction?

One issue with the described approach is the rather inefficient code that results after register allocation. For example, there are a lot of instructions of the form "mov %d0, %d0", copying a register into itself.
My guess is that the following happens. Before register allocation there are instructions of the form:

    mov %virtual_reg0, %d0
    mov %virtual_reg1, %d1
    fsub64

which ensure that the operand constraints of the operation are fulfilled, i.e. that the operands are in the right registers. During allocation, the register allocator assigns each virtual register the same physical register it is about to be copied into. Therefore the code becomes:

    mov %d0, %d0
    mov %d1, %d1
    fsub64

But then there is no "useless copy elimination" pass or peephole pass that would remove such copies. Question: Is there such a pass available in LLVM? It would also be interesting to know why the register allocator does not eliminate such coalesced moves itself. Wouldn't that make sense?

Does this idea of representing the emulated FP operation calls as instructions, as described above, make sense? Or do you see easier or more useful ways to do it?

2. In reality, the processor has only 32-bit registers. Therefore, any f64 value has to be mapped to two 32-bit registers. What is the best way to achieve this? I guess this is a well-known kind of problem. So far I was thinking about introducing some pseudo f64 registers, i.e. the %dX used above, and working with them in the instruction descriptions. Then at a later stage, probably after lowering and selection, I would expand them into pairs of load or store operations. But I'm not quite sure that this is the right way to go. I suspect that something can be done using some form of EXPAND operation in the lowering pass. For example, I see that assignments of f64 immediates to globals are expanded by LLVM automatically into two 32-bit stores, which is very nice. Maybe it is possible to do the same for 64-bit registers?

OK, enough questions for today ;)

Thanks for any feedback,
Roman
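The "useless copy elimination" idea from point 1 can be sketched as a tiny peephole pass. This is a toy Python model and not LLVM code: the `Instr` tuple, the function name `remove_identity_copies`, and the extra register `%d2` are made up for illustration. It only shows the core idea, namely that a register-to-register move whose source and destination coincide after allocation can be dropped without changing behavior.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical post-regalloc instruction: opcode plus register operands."""
    opcode: str
    operands: Tuple[str, ...] = ()


def remove_identity_copies(code: List[Instr]) -> List[Instr]:
    """Drop every 'mov rX, rX'; such a copy has no effect on machine state."""
    kept = []
    for ins in code:
        is_identity = (
            ins.opcode == "mov"
            and len(ins.operands) == 2
            and ins.operands[0] == ins.operands[1]
        )
        if not is_identity:
            kept.append(ins)
    return kept


# The sequence from the mail, plus one real copy that must survive.
after_regalloc = [
    Instr("mov", ("%d0", "%d0")),
    Instr("mov", ("%d1", "%d1")),
    Instr("fsub64"),
    Instr("mov", ("%d0", "%d2")),  # %d2 is a hypothetical extra register
]
cleaned = remove_identity_copies(after_regalloc)
print([ins.opcode for ins in cleaned])
```

Only moves with identical operands are removed, so the pass is safe to run on any instruction list of this shape; whether a real backend does this in the allocator or in a later pass is a separate design question.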
On Mon, 9 Oct 2006, Roman Levenstein wrote:
> But your previous explanations were so good that I implemented in my
> backend last week almost the same that you've done now in the
> X86InstrSSE.td. I even introduced isCommutable parameter to indicate
> this property, just as you did. So, by now integer arithmetic and
> general purpose instructions are implemented. I'm working on the FP
> support now.

Great :)

> Some feedback about tblgen from my side, i.e. an LLVM newcomer with
> quite some compiler construction experience:
>
> When it comes to tblgen descriptions and corresponding DAG selection
> and lowering code to be written for a backend, I found the use of
> InFlag, OutFlag and chains less understandable, very underspecified and
> not (well) documented. Even though they are used in all backends,
> their semantics and correct use is far from obvious (even though I'm
> not new to compiler writing).

Right, it is unfortunate that the code generator isn't better documented :(. Patches graciously accepted :)

> I spent most time on getting these things right. And I learned that if
> they misbehave it has very fatal consequences on the overall code
> selection process. Now it works, but I still don't quite understand
> their overall semantics and don't feel very confident about them. And I
> guess I'm not the only one. Therefore, I would kindly ask to provide
> here on the developers list and may be even in the docs a clear
> explanation of these concepts, giving guidelines on their usage and
> probably a small, but understandable example making use of them. I think
> such a description would make creation of new backends much easier and
> faster.

Basically, flag operands are a hack used to handle resources that are not accurately modeled in the scheduler (e.g. condition codes, explicit register assignments, etc). The basic idea of a flag operand is that it requires the scheduler to keep the "flagged" nodes stuck together in the output machine instructions.
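To make the "stuck together" idea concrete, here is a toy scheduler in Python. It is a simplified model and not LLVM's scheduler: the function `schedule`, the node names, and the representation of edges as (producer, consumer) pairs are all assumptions made for illustration. It also assumes each node has at most one incoming and one outgoing flag edge. Flagged nodes are merged into one unit that is emitted as a block, so nothing can be placed between them.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (producer, consumer)


def schedule(nodes: List[str], deps: List[Edge], flags: List[Edge]) -> List[str]:
    """Toy list scheduler: a flag edge forces consumer right after producer."""
    # Merge every flag chain into one group that is emitted as a single unit.
    group_of: Dict[str, List[str]] = {n: [n] for n in nodes}
    for producer, consumer in flags:
        merged = group_of[producer] + group_of[consumer]
        for n in merged:
            group_of[n] = merged

    # Unique groups, in order of first appearance in `nodes`.
    pending: List[List[str]] = []
    for n in nodes:
        if not any(group_of[n] is g for g in pending):
            pending.append(group_of[n])

    emitted: List[str] = []
    done = set()
    edges = deps + flags
    while pending:
        for g in pending:
            # Producers outside the group must already have been emitted.
            external = [p for p, c in edges if c in g and p not in g]
            if all(p in done for p in external):
                emitted.extend(g)
                done.update(g)
                pending.remove(g)
                break
        else:
            raise ValueError("cyclic dependencies")
    return emitted


# 'cmp' sets condition codes that 'select' reads; 'add' might clobber them.
nodes = ["load", "cmp", "add", "select"]
deps = [("load", "cmp"), ("load", "add"), ("add", "select")]
flags = [("cmp", "select")]
order = schedule(nodes, deps, flags)
print(order)
```

Without the flag edge, a scheduler that simply follows node order could emit `add` between `cmp` and `select`; with it, `add` is emitted first and `cmp` and `select` end up adjacent.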
If you have a specific question, I'm more than happy to answer it.

-Chris