MathOnNapkins .
2013-Nov-09 21:29 UTC
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
Dear All, I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502 family CPUs. The particular subtarget has 16-bit registers at most, so a 32-bit result is not legal. Normally, if you mark this as "Legal" or "Expand", then it will expand the node into a more nodes as follows in an example: shl i32 %a , 2 => high_sdvalue = (or (shr %b, 14), (shl %c, 2) ) => low_sdvalue = (shl %b, 2) where %b would be the lower 16 bits of %a, and %c is the upper 16 bits of %a. Since this target doesn't have a barrel shifter, it would be a huge performance hit to construct the DAG in this way, due to the large number of right shifts). Thus, I wish to mark ISD::SHL/SHR as "Custom" in setOperationAction(), but I'm unsure of how to manually split the 32-bit shiftee SDValue into two 16-bit values. The standard logic in /lib/CodeGen/SelectionDAG/LegalizeTypes.cpp and LegalizeIntegerTypes.cpp has a member function called GetExpandedInteger(). Unfortunately, this is a private method, so I can't gain access to this without hacking on code outside of my target's (/lib/Target/x65 in this case). To be 100% clear, I actually already did attempt this route, and when compiling received an error that GetExpandedInteger was a private method. What I would ultimately like to lower this to is probably a 16-bit ISD::SHL_PARTS, which I'd then further lower to a sequence of pairs of { shift left, rotate left through carry} instructions. Example of how that would look in relation to the aforementioned example: asl %b rol %c asl %b rol %c There doesn't seem to be a set of standard node types analogous to ADD / ADDC / ADDE for shift operations, which is really what I'm after here. I would very much appreciate any advice or insight you could provide on this matter. Thanks, ~Dave Waggoner / MathOnNapkins -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131109/c5b92890/attachment.html>
Steve Montgomery
2013-Nov-10 17:50 UTC
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseudos into shift and rotate sequences with masking where required (e.g. replacing SRL x,7 with ROL x,2) I've no idea if that's the best way but it seems to work for me. Steve On 9 Nov 2013, at 21:29, MathOnNapkins . <mathonnapkins at gmail.com> wrote:> Dear All, > > I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502 family CPUs. The particular subtarget has 16-bit registers at most, so a 32-bit result is not legal. Normally, if you mark this as "Legal" or "Expand", then it will expand the node into a more nodes as follows in an example: > > shl i32 %a , 2 > > => high_sdvalue = (or (shr %b, 14), (shl %c, 2) ) > => low_sdvalue = (shl %b, 2) > > where %b would be the lower 16 bits of %a, and %c is the upper 16 bits of %a. > > Since this target doesn't have a barrel shifter, it would be a huge performance hit to construct the DAG in this way, due to the large number of right shifts). Thus, I wish to mark ISD::SHL/SHR as "Custom" in setOperationAction(), but I'm unsure of how to manually split the 32-bit shiftee SDValue into two 16-bit values. The standard logic in /lib/CodeGen/SelectionDAG/LegalizeTypes.cpp and LegalizeIntegerTypes.cpp has a member function called GetExpandedInteger(). Unfortunately, this is a private method, so I can't gain access to this without hacking on code outside of my target's (/lib/Target/x65 in this case). To be 100% clear, I actually already did attempt this route, and when compiling received an error that GetExpandedInteger was a private method. > > What I would ultimately like to lower this to is probably a 16-bit ISD::SHL_PARTS, which I'd then further lower to a sequence of pairs of { shift left, rotate left through carry} instructions. Example of how that would look in relation to the aforementioned example: > > asl %b > rol %c > asl %b > rol %c > > There doesn't seem to be a set of standard node types analogous to ADD / ADDC / ADDE for shift operations, which is really what I'm after here. > > I would very much appreciate any advice or insight you could provide on this matter. > > Thanks, > ~Dave Waggoner / MathOnNapkins > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Steve Montgomery
2013-Nov-10 18:35 UTC
[LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
I forgot to mention that I used EXTRACT_ELEMENT in my backend to get the high and low parts of an SDValue. On 10 Nov 2013, at 17:50, Steve Montgomery <stephen.montgomery3 at btinternet.com> wrote:> I had a similar problem with a backend for the 68HC12 family which also has no barrel shifter. Some 68HC12 CPUs support shift for just one of the 16-bit registers and only support rotation on the 2 8-bit subregs of that 16-bit register. That means the only practical solution for 32-bit shifts is to lower to a libcall but my situation for 16-bit shifts sounds similar to yours for 32-bit shifts. > > I defined pseudo-instructions in my InstrInfo.td that implemented 8 and 16-bit SHL, SRL and SRA with one variant of each for the shift count in a register and another for an immediate shift count. Some shifts by immediate values and all shifts by non-immediate values need to be implemented using a loop so I used a custom inserter. I added a pass to expand the remaining shift pseudos into shift and rotate sequences with masking where required (e.g. replacing SRL x,7 with ROL x,2) > > I've no idea if that's the best way but it seems to work for me. > > Steve > > On 9 Nov 2013, at 21:29, MathOnNapkins . <mathonnapkins at gmail.com> wrote: > >> Dear All, >> >> I am trying to custom lower 32-bit ISD::SHL and SHR in a backend for 6502 family CPUs. The particular subtarget has 16-bit registers at most, so a 32-bit result is not legal. Normally, if you mark this as "Legal" or "Expand", then it will expand the node into a more nodes as follows in an example: >> >> shl i32 %a , 2 >> >> => high_sdvalue = (or (shr %b, 14), (shl %c, 2) ) >> => low_sdvalue = (shl %b, 2) >> >> where %b would be the lower 16 bits of %a, and %c is the upper 16 bits of %a. >> >> Since this target doesn't have a barrel shifter, it would be a huge performance hit to construct the DAG in this way, due to the large number of right shifts). Thus, I wish to mark ISD::SHL/SHR as "Custom" in setOperationAction(), but I'm unsure of how to manually split the 32-bit shiftee SDValue into two 16-bit values. The standard logic in /lib/CodeGen/SelectionDAG/LegalizeTypes.cpp and LegalizeIntegerTypes.cpp has a member function called GetExpandedInteger(). Unfortunately, this is a private method, so I can't gain access to this without hacking on code outside of my target's (/lib/Target/x65 in this case). To be 100% clear, I actually already did attempt this route, and when compiling received an error that GetExpandedInteger was a private method. >> >> What I would ultimately like to lower this to is probably a 16-bit ISD::SHL_PARTS, which I'd then further lower to a sequence of pairs of { shift left, rotate left through carry} instructions. Example of how that would look in relation to the aforementioned example: >> >> asl %b >> rol %c >> asl %b >> rol %c >> >> There doesn't seem to be a set of standard node types analogous to ADD / ADDC / ADDE for shift operations, which is really what I'm after here. >> >> I would very much appreciate any advice or insight you could provide on this matter. >> >> Thanks, >> ~Dave Waggoner / MathOnNapkins >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
Apparently Analagous Threads
- [LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
- [LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
- [LLVMdev] [Target] Custom Lowering expansion of 32-bit ISD::SHL, ISD::SHR without barrel shifter
- [LLVMdev] Troubleshooting Internal Garbage Collection
- [LLVMdev] Troubleshooting Internal Garbage Collection