Hello,

I have some questions about folding operations on symbols during the instruction printing stage with MC. At the moment I'm working with global symbols, but I guess other symbol types should behave the same way.

My first question is: how can I negate the address of a symbol?

Consider this piece of code:

    char g_var[80];
    char foo(int a) { return g_var[a]; }

This gets compiled into something like (in pseudo asm):

    addi a, g_var
    load retreg, a

But I don't have an add-with-immediate instruction, so I have to do the following instead:

    subi a, -g_var   // negate g_var addr
    load retreg, a

One solution I thought of is passing a target flag indicating that a negation is needed when lowering the MachineInstr into an MCInst, and adding an MCExpr to negate the symbol. But I want to know if there's a better way to do this, instead of delaying it until the MCInst lowering stage.

The other question is how to fold single and complex operations on symbols. Say we have something like:

    unsigned int g_var[80];
    unsigned int foo() { return (unsigned int)&g_var[0] & 0x1234; }

Currently this moves the g_var address into a register and then performs the AND operation, but I want this to be done at compile time, so that we get something like:

    move retreg, (g_var & 0x1234)

Without touching anything else, only additions get folded, but this could be extended to other operations such as or, xor, shifts, etc. A more complex case would be combining several operations in a single statement. So my question is how to achieve this.

As an idea, I've thought of using a pseudo instruction that takes an operand depending on the instruction to fold, then expanding this pseudo instruction into the real move instruction by setting a target flag that depends on the operation to fold, and finally creating an MCExpr from these flags in the MCInst lowering stage. The problem is that this can't handle more than one operation per statement.

Thanks

Hello,

On Apr 26, 2011, at 6:30 AM, Borja Ferrer wrote:

> Hello, I have some questions regarding folding operations with symbols during the instruction print stage with MC. At the moment I'm working with global symbols but I guess that other symbol types should be equivalent.
>
> My first question is how can I negate the address of a symbol?
>
> Consider this piece of code:
> char g_var[80];
> char foo(int a) { return g_var[a]; }
>
> this gets compiled into something like (in pseudo asm):
> addi a, g_var
> load retreg, a
>
> but I don't have an add with immediate instruction so I have to do the following
> subi a, -g_var // negate g_var addr
> load retreg, a
>
> A solution I thought could be passing a target flag indicating that a negation is needed when lowering the machineinstr into a MCInst, and adding a MCExpr to negate the symbol. But I want to know if there's a better way to do this, instead of delaying it to the stage of MCInst lowering.
>

These sorts of constraints are normally enforced prior to lowering to MC. Doing them directly as part of instruction selection as much as possible is good (the ARM target has examples of this for using ADD/SUB immediate instructions). For example, don't express in the target .td file(s) that you have an add-immediate instruction if you actually don't, but do add patterns for the operation using the subtract-immediate instruction. For symbolic immediate references, you're correct that the expression on the operand will include the negation.

MC is designed such that it should always represent legal instructions, and only legal instructions. That includes things like register operands being legal for the instruction, immediates being in range, etc. There's (currently) no verification pass for those constraints, but that's the idea, so waiting 'til after MC lowering to check for and transform the instructions is not preferable and is likely to break if/when we add such a verification pass.

If your target has properties that make it impossible to do this at instruction selection time, I would suggest a late machine function pass that will scan for and transform the instructions as necessary. This would all be at the MachineInstr level, before lowering to MC.

> The other question is how to fold single and complex operations on symbols, say we have something like:
>
> unsigned int g_var[80];
> unsigned int foo() { return (unsigned int)&g_var[0] & 0x1234; }
>
> Currently this moves the g_var address into a register and then performs the and operation, but I want this to be done at compilation time, so we have something like:
>
> move retreg, (g_var & 0x1234)
>

For many targets this isn't legal, as the object file format used can't represent those sorts of expressions in a relocation. It sounds like your situation is different, though.

> Without touching anything else only additions get folded, but this could be expanded into other operations like or, xor, shifts, etc. A more complex case would be combining operations in a single statement. So my question is how to achieve this. As an idea I've thought of using a pseudo instruction that takes an operand depending on the instruction to fold, then expand this pseudo instr into the real move instruction by setting a target flag depending on the operation to fold, and in the MCInst lower stage create a MCExpr depending on these flags, but this has the problem that it can't handle more than one operation per statement.

A custom lowering or a target DAG combine would likely be your best bet.

Regards,
Jim

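The add-to-subtract rewrite discussed above rests on two's-complement wraparound: subtracting the negated value yields the same result as adding the original one. The following is a minimal sketch of that arithmetic only, not of any LLVM code. The 16-bit width and the names add_imm, sub_imm and negate16 are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16-bit target: values wrap modulo 2^16 (assumed width).

// Models the add-immediate instruction the target does NOT have.
static uint16_t add_imm(uint16_t reg, uint16_t imm) {
    return static_cast<uint16_t>(reg + imm);
}

// Models the subtract-immediate instruction the target does have.
static uint16_t sub_imm(uint16_t reg, uint16_t imm) {
    return static_cast<uint16_t>(reg - imm);
}

// Two's-complement negation of an immediate or of a resolved symbol address.
static uint16_t negate16(uint16_t value) {
    return static_cast<uint16_t>(0u - value);
}

// For every reg and addr: sub_imm(reg, negate16(addr)) == add_imm(reg, addr).
static bool rewrite_is_equivalent(uint16_t reg, uint16_t addr) {
    return sub_imm(reg, negate16(addr)) == add_imm(reg, addr);
}
```

For instance, with an assumed address of 0x0100 for g_var, the operand that the subtract instruction needs is 0xFF00, and rewrite_is_equivalent(5, 0x0100) holds.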
Hello Jim, thanks for the reply.

For normal additions with immediates I've done the same as ARM does, basically transforming add(x, imm) nodes into sub(x, -imm) with a pattern in the .td file like this:

    def : Pat<(add DLDREGS:$src1, imm:$src2),
              (SUBIWRdK DLDREGS:$src1, (imm16_neg_XFORM imm:$src2))>;

Now, the typical pattern for additions with global addresses looks like this (taken from x86):

    def : Pat<(add GR32:$src1, (X86Wrapper tglobaladdr:$src2)),
              (ADD32ri GR32:$src1, tglobaladdr:$src2)>;

But I can't write that, since I don't have an add-with-immediate instruction, and doing:

    def : Pat<(add DREGS:$src, (Wrapper tglobaladdr:$src2)),
              (SUBIWRdK DREGS:$src, tglobaladdr:$src2)>;

is wrong because the tglobaladdr has to be negated somehow. So I don't understand how I should negate the symbol reference using patterns, if it's even possible. The obvious hack is adding a "-" character when lowering the symbol reference into text.

Regarding my second question: as you mentioned, all symbols have static addresses so no relocations are performed, and it should therefore be safe to fold immediate operations into the symbol reference. My problem here is that I don't know how to fold an arbitrary expression on a global (initially in the form of a DAG) into something that can later be translated into an expression with MC. It's an odd case because the operations are performed in the operand of an instruction, and since it has to support any arbitrary expression, you can't cover all combinations of operations with custom instructions. So how should I proceed here using custom lowering or target DAG combines?

Thanks
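One way to picture the "arbitrary expression on a global" problem is as a small expression tree whose leaves are symbols and constants, which is roughly the shape MC's expression classes (MCSymbolRefExpr, MCConstantExpr, MCUnaryExpr, MCBinaryExpr) give an operand. The sketch below is a toy stand-in written for illustration, not the LLVM API: the Expr type, the helpers sym, con, make, render and eval, and the address 0x1F30 used in the usage note are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical expression node; NOT an LLVM class.
struct Expr {
    enum Kind { Symbol, Constant, Neg, And, Or, Xor, Add } kind;
    std::string name;                // used by Symbol
    uint32_t value = 0;              // used by Constant
    std::shared_ptr<Expr> lhs, rhs;  // operands (rhs unused by Neg)
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr sym(const std::string &n) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Symbol;
    e->name = n;
    return e;
}

ExprPtr con(uint32_t v) {
    auto e = std::make_shared<Expr>();
    e->kind = Expr::Constant;
    e->value = v;
    return e;
}

// Builds a unary (Neg) or binary node; nesting calls composes operations.
ExprPtr make(Expr::Kind k, ExprPtr l, ExprPtr r = nullptr) {
    auto e = std::make_shared<Expr>();
    e->kind = k;
    e->lhs = l;
    e->rhs = r;
    return e;
}

// Renders the tree as operand text, the way an asm printer might.
std::string render(const ExprPtr &e) {
    auto bin = [&](const char *op) {
        return "(" + render(e->lhs) + op + render(e->rhs) + ")";
    };
    switch (e->kind) {
    case Expr::Symbol:   return e->name;
    case Expr::Constant: return std::to_string(e->value);
    case Expr::Neg:      return "-(" + render(e->lhs) + ")";
    case Expr::And:      return bin(" & ");
    case Expr::Or:       return bin(" | ");
    case Expr::Xor:      return bin(" ^ ");
    case Expr::Add:      return bin(" + ");
    }
    return "";
}

// Evaluates the tree once every symbol address is known (static addresses).
uint32_t eval(const ExprPtr &e, const std::map<std::string, uint32_t> &addrs) {
    switch (e->kind) {
    case Expr::Symbol:   return addrs.at(e->name);
    case Expr::Constant: return e->value;
    case Expr::Neg:      return 0u - eval(e->lhs, addrs);
    case Expr::And:      return eval(e->lhs, addrs) & eval(e->rhs, addrs);
    case Expr::Or:       return eval(e->lhs, addrs) | eval(e->rhs, addrs);
    case Expr::Xor:      return eval(e->lhs, addrs) ^ eval(e->rhs, addrs);
    case Expr::Add:      return eval(e->lhs, addrs) + eval(e->rhs, addrs);
    }
    return 0;
}
```

With this shape, make(Expr::And, sym("g_var"), con(0x1234)) stands for the operand in the move example, and wrapping it in another make call adds a second operation without needing a new flag or pseudo instruction per combination. Evaluating it against an assumed address of 0x1F30 for g_var gives 0x1230.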