Alex Susu via llvm-dev
2016-Oct-24 23:30 UTC
[llvm-dev] Instruction selection confusion at register - chooses vector register instead of scalar one
Hello.

I have extended the BPF back end with vector registers (inspired by Mips MSA), something like this:

    def MSA128D: RegisterClass<"Connex", [v128i16], 32,
                               (sequence "Wh%u", 0, 31)>;

I also added vector store and load instructions in the style of Mips MSA - see https://github.com/llvm-mirror/llvm/blob/master/lib/Target/Mips/MipsMSAInstrInfo.td, look for "def ST_D", etc.

Note, however, that my vector unit has a separate memory space. This is why I defined the vector store like this:

    class ST_DESC_BASE<string instr_asm, SDPatternOperator OpNode,
                       ValueType TyNode, RegisterOperand ROWD,
                       Operand MemOpnd = uimm4_ptr, ImmLeaf Addr = immLeafAlex,
                       InstrItinClass itin = NoItinerary> {
      dag OutOperandList = (outs);
      dag InOperandList = (ins ROWD:$wd, MemOpnd:$addrdst);
      string AsmString = !strconcat("LS[$addrdst] = $wd;", instr_asm);
      list<dag> Pattern = [(OpNode (TyNode ROWD:$wd), Addr:$addrdst)];
      InstrItinClass Itinerary = itin;
      string DecoderMethod = "DecodeMSA128Mem";
    }

BPF also has its own scalar stores and loads (with the standard i64 registers), for example (from https://github.com/llvm-mirror/llvm/blob/master/lib/Target/BPF/BPFInstrInfo.td):

    class STOREi64<bits<2> Opc, string OpcodeStr, PatFrag OpNode>
        : STORE<Opc, OpcodeStr, [(OpNode i64:$src, ADDRri:$addr)]>;

However, the spills and reloads of vector registers that are created automatically at basic-block boundaries use the scalar stores and loads, NOT the vector ones that are also defined. For example, I obtain this ASM code when compiling with my LLVM:

    std -512(r10), R(0)
    ; end of predecessor BB
    ...

    ; beginning of current BB
    ldd R(0), -512(r10)

As we can see, STOREi64 normally takes an i64 scalar register, but here a v128i16 register, R(0), is used where an i64 scalar one (r0-r31) is expected.

Could you please tell me if there is an easy way to fix this? I guess the problem is related to the fact that the vector unit has its own memory space, and I guess LLVM normally spills registers on the stack. If so, can I specify a different spill region for the vector registers?

Thank you,
Alex
Arsenault, Matthew via llvm-dev
2016-Oct-25 02:29 UTC
[llvm-dev] Instruction selection confusion at register - chooses vector register instead of scalar one
Spills created at the end of the block (I assume you mean what fast regalloc does at -O0) are created long after instruction selection. In that case it sounds like your implementation of storeRegToStackSlot/loadRegFromStackSlot is broken.

-Matt
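To illustrate the point about storeRegToStackSlot/loadRegFromStackSlot, here is a minimal, self-contained C++ sketch of the decision those hooks have to make: choose the spill and reload instruction from the register class instead of always emitting the scalar one. This is a toy model and not LLVM code. The names RegClass, SpillInstr, selectSpillStore and selectSpillLoad are hypothetical, LD_D is assumed to be the vector load that mirrors the ST_D store mentioned in the thread, and the memory-space strings are only labels.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy stand-ins for the two register classes discussed in the thread.
// Hypothetical names for illustration only; these are not LLVM types.
enum class RegClass { ScalarGPR, VectorMSA128D };

// Hypothetical description of the instruction chosen for a spill or reload.
struct SpillInstr {
  std::string opcode;   // mnemonic of the store or load to emit
  std::string memSpace; // memory space that holds the spill slot
};

// Models the choice a storeRegToStackSlot-style hook must make:
// dispatch on the register class rather than assuming a scalar store.
SpillInstr selectSpillStore(RegClass rc) {
  switch (rc) {
  case RegClass::ScalarGPR:
    return {"STD", "stack"};   // BPF scalar 64-bit store
  case RegClass::VectorMSA128D:
    return {"ST_D", "LS"};     // vector store into the vector unit's memory
  }
  throw std::logic_error("cannot spill this register class");
}

// Models the matching loadRegFromStackSlot-style choice for the reload.
SpillInstr selectSpillLoad(RegClass rc) {
  switch (rc) {
  case RegClass::ScalarGPR:
    return {"LDD", "stack"};   // BPF scalar 64-bit load
  case RegClass::VectorMSA128D:
    return {"LD_D", "LS"};     // assumed vector counterpart of ST_D
  }
  throw std::logic_error("cannot reload this register class");
}
```

With this dispatch, a v128i16 register would be spilled with the vector store rather than with std, and an unhandled register class fails loudly instead of silently producing a malformed scalar instruction. In a real back end the same check would live inside the target's InstrInfo hooks and operate on the register class passed to them.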
Alex Susu via llvm-dev
2016-Oct-27 21:46 UTC
[llvm-dev] Instruction selection confusion at register - chooses vector register instead of scalar one
Hello.

Matt, thanks for pointing me to the storeRegToStackSlot and loadRegFromStackSlot methods of the [Target]InstrInfo class. It turned out that they are indeed responsible for the malformed instruction discussed in the first email, where a vector register was used in a scalar store/load.

I have one more question: can I optimize inter-block register allocation by avoiding the spill (and the associated reload) at the end of a basic block that has only one successor? More exactly, I have vector.body.preheader followed by vector.body (vector.body has as successors itself and an exit block, which is empty). Can I tell LLVM to do register allocation so that it avoids spilling at the end of vector.body.preheader and avoids performing the corresponding loads at the beginning of vector.body?

Thank you,
Alex