Hal Finkel
2014-Oct-23 18:19 UTC
[LLVMdev] Question regarding getElementPtr/Addressing modes in backend
----- Original Message -----> From: "Bruce Hoult" <bruce at hoult.org> > To: "Johnny Val" <johnnydval at gmail.com> > Cc: "<llvmdev at cs.uiuc.edu>" <llvmdev at cs.uiuc.edu> > Sent: Thursday, October 23, 2014 8:31:35 AM > Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing modes in backend > > Many CPU instruction sets have "autoincrement" / "postincrement" > (pop) / "predecrement" (push) instructions that dereference a > pointer register and also alter it by the size of the load or store. > > > The PowerPC has "load byte/short/word with update" instructions that > add an arbitrary positive or negative literal offset (unrelated to > the load/store size) to the pointer register BEFORE the load or > store, and update the register with the final address. > > So maybe the PowerPC back end is worth looking at?Yes, and so it ARM (which has both pre- and post-increment loads/stores). For PowerPC, for example, this works as follows: In PPCISelLowering.cpp, we have this: // PowerPC has pre-inc load and store's. setIndexedLoadAction(ISD::PRE_INC, MVT::i1, Legal); setIndexedLoadAction(ISD::PRE_INC, MVT::i8, Legal); setIndexedLoadAction(ISD::PRE_INC, MVT::i16, Legal); setIndexedLoadAction(ISD::PRE_INC, MVT::i32, Legal); setIndexedLoadAction(ISD::PRE_INC, MVT::i64, Legal); setIndexedStoreAction(ISD::PRE_INC, MVT::i1, Legal); setIndexedStoreAction(ISD::PRE_INC, MVT::i8, Legal); setIndexedStoreAction(ISD::PRE_INC, MVT::i16, Legal); setIndexedStoreAction(ISD::PRE_INC, MVT::i32, Legal); setIndexedStoreAction(ISD::PRE_INC, MVT::i64, Legal); Then the DAGCombiner calls into the backend using the function PPCTargetLowering::getPreIndexedAddressParts, which tells DAGCombine when it can combine the the load and the increment into a singe pre/post-increment load or store. The pre-increment stores are instruction-selected in PPCInstrInfo.td (look for the pre_store patterns). The pre-increment loads are selected in PPCISelDAGToDAG.cpp (TableGen can't yet handle instructions returning multiple outputs), look for the code just after ISD::PRE_INC around line 1060.>From your formula below, it looks like you have a pre-increment load.-Hal> > > On Fri, Oct 24, 2014 at 1:30 AM, Johnny Val < johnnydval at gmail.com > > wrote: > > > > Hi Steve, > > Thanks for the tip regarding MIOperandInfo, I didn't think of that > part of the tablegen description. > > Sadly, I did actually mean: r1 = *(i0 += m0). > > So increment i0 by m0. Read memory the memory location "pointed" to > by i0. Store in r1. Sadly I am not too familiar with compiler > terminology, so I don't know if there is a proper term for such a > load. > > > > > > > On Thu, Oct 23, 2014 at 12:23 PM, Steve Montgomery < > stephen.montgomery3 at btinternet.com > wrote: > > > You might need to modify selectAddr so it doesn't fold the add node > but instead returns two operands: one for the pointer and one for > the offset. You can control the register classes for the two > components of the address using MIOperandInfo so your base pointer > can be IClass and the offset MClass. See SparcInstrInfo.td (and > probably others) for examples. > > I'm assuming you meant r1 = *(i0 + m0) rather than r1 = *(i0 += m0)? > I don't know how you'd achieve that latter. > > Steve Montgomery > > On 21 Oct 2014, at 17:15, Johnny Val < johnnydval at gmail.com > wrote: > > > > > Hi, > > > > I am writing a backend and having issues with properly lowering the > > result of getElementPtr ( specifically the add node that it > > generates). > > > > If we take this IR: > > > > %struct.rectangle = type { i24, i24 } > > > > ; Function Attrs: nounwind readonly > > define i24 @area(%struct.rectangle* nocapture readonly %r) #0 { > > entry: > > %width = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 0 > > %0 = load i24* %width, align 4, !tbaa !1 > > %height = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 > > 1 > > %1 = load i24* %height, align 4, !tbaa !6 > > %mul = mul nsw i24 %1, %0 > > ret i24 %mul > > } > > > > The DAG before isel would look like isel.png. > > > > I would then pattern match the load nodes with: > > > > [(set i24:$val, (load addr:$addr))] > > > > Where addr is a complex pattern. This is fine for the 0th element > > in the struct, as there is no offset so the resultant assembly > > would be (which works fine): > > > > r0 = *(i0) > > > > The issue I have is with the second load. I need the result to be: > > > > m0 = 4 > > r1 = *(i0 += m0) (offset is stored in a register(m0), and modifies > > the original pointer) > > > > rather than > > > > r1 = *(i0 + 4) (immediate) > > > > (The specific registers numbers don't matter). > > > > I'm not sure how to achieve that. Currently addr gets matched in > > SelectAddr in XYZISelDAGToDAG.cpp. It will match the add node and > > "combine" (I'm not sure this is the correct terminology) it into > > my LDR node. The result of this can be seen sched.png > > > > So instead of that TargetConst<4> I need a something that copies 4 > > into an M register, and then LDR uses that. > > > > I'm not sure if I should put logic in SelectAddr to create a > > virtual register and create a series of copyfromreg/copytoreg to > > copy the constant into an M register and then pass that to the > > node? I feel like that is fairly ugly (as no other back-end does > > this), but I'm not sure what a better way would be. > > > > I have tried looking through other backends for an example, but > > couldn't find something similar. That could be due to me not > > knowing where to look. > > > > Thanks in advance for any help. > > > > Johnny > > > > <isel.png><sched.png>_______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Johnny Val
2014-Oct-24 12:22 UTC
[LLVMdev] Question regarding getElementPtr/Addressing modes in backend
Thank you for all the help. I feel like I am pretty close to getting this working for my architecture. The issue I am having right now is that I cannot get the DAGCombiner to lower/combine a load/store into a pre-indexed load/store. I believe the "issue" is with this code(DAGCombiner.cpp:7841): // If the pointer is not an add/sub, or if it doesn't have multiple uses, bail // out. There is no reason to make this a preinc/predec. if ((Ptr.getOpcode() != ISD::ADD && Ptr.getOpcode() != ISD::SUB) || Ptr.getNode()->hasOneUse()) return false; Specifically the "Ptr.GetNode()->hasOneUse()" statement. It seems like a fair assumption to make as most architectures probably have the option for "normal" loads as well as preinc/predec loads. Sadly my architecture only supports preinc addressing when accessing memory. What would you suggest to do in this case? Create custom lowering for ISD::STORE/LOAD in my backend? Or do you think it's better to patch DAGCombiner.cpp to allow targets to disable that check? Cheers, Johnny On Thu, Oct 23, 2014 at 7:19 PM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Bruce Hoult" <bruce at hoult.org> > > To: "Johnny Val" <johnnydval at gmail.com> > > Cc: "<llvmdev at cs.uiuc.edu>" <llvmdev at cs.uiuc.edu> > > Sent: Thursday, October 23, 2014 8:31:35 AM > > Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing modes > in backend > > > > Many CPU instruction sets have "autoincrement" / "postincrement" > > (pop) / "predecrement" (push) instructions that dereference a > > pointer register and also alter it by the size of the load or store. > > > > > > The PowerPC has "load byte/short/word with update" instructions that > > add an arbitrary positive or negative literal offset (unrelated to > > the load/store size) to the pointer register BEFORE the load or > > store, and update the register with the final address. > > > > So maybe the PowerPC back end is worth looking at? > > Yes, and so it ARM (which has both pre- and post-increment loads/stores). > > For PowerPC, for example, this works as follows: > > In PPCISelLowering.cpp, we have this: > > // PowerPC has pre-inc load and store's. > setIndexedLoadAction(ISD::PRE_INC, MVT::i1, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i8, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i16, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i32, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i64, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i1, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i8, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i16, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i32, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i64, Legal); > > Then the DAGCombiner calls into the backend using the function > PPCTargetLowering::getPreIndexedAddressParts, which tells DAGCombine when > it can combine the the load and the increment into a singe > pre/post-increment load or store. > > The pre-increment stores are instruction-selected in PPCInstrInfo.td (look > for the pre_store patterns). The pre-increment loads are selected in > PPCISelDAGToDAG.cpp (TableGen can't yet handle instructions returning > multiple outputs), look for the code just after ISD::PRE_INC around line > 1060. > > From your formula below, it looks like you have a pre-increment load. > > -Hal > > > > > > > On Fri, Oct 24, 2014 at 1:30 AM, Johnny Val < johnnydval at gmail.com > > > wrote: > > > > > > > > Hi Steve, > > > > Thanks for the tip regarding MIOperandInfo, I didn't think of that > > part of the tablegen description. > > > > Sadly, I did actually mean: r1 = *(i0 += m0). > > > > So increment i0 by m0. Read memory the memory location "pointed" to > > by i0. Store in r1. Sadly I am not too familiar with compiler > > terminology, so I don't know if there is a proper term for such a > > load. > > > > > > > > > > > > > > On Thu, Oct 23, 2014 at 12:23 PM, Steve Montgomery < > > stephen.montgomery3 at btinternet.com > wrote: > > > > > > You might need to modify selectAddr so it doesn't fold the add node > > but instead returns two operands: one for the pointer and one for > > the offset. You can control the register classes for the two > > components of the address using MIOperandInfo so your base pointer > > can be IClass and the offset MClass. See SparcInstrInfo.td (and > > probably others) for examples. > > > > I'm assuming you meant r1 = *(i0 + m0) rather than r1 = *(i0 += m0)? > > I don't know how you'd achieve that latter. > > > > Steve Montgomery > > > > On 21 Oct 2014, at 17:15, Johnny Val < johnnydval at gmail.com > wrote: > > > > > > > > > Hi, > > > > > > I am writing a backend and having issues with properly lowering the > > > result of getElementPtr ( specifically the add node that it > > > generates). > > > > > > If we take this IR: > > > > > > %struct.rectangle = type { i24, i24 } > > > > > > ; Function Attrs: nounwind readonly > > > define i24 @area(%struct.rectangle* nocapture readonly %r) #0 { > > > entry: > > > %width = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 0 > > > %0 = load i24* %width, align 4, !tbaa !1 > > > %height = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 > > > 1 > > > %1 = load i24* %height, align 4, !tbaa !6 > > > %mul = mul nsw i24 %1, %0 > > > ret i24 %mul > > > } > > > > > > The DAG before isel would look like isel.png. > > > > > > I would then pattern match the load nodes with: > > > > > > [(set i24:$val, (load addr:$addr))] > > > > > > Where addr is a complex pattern. This is fine for the 0th element > > > in the struct, as there is no offset so the resultant assembly > > > would be (which works fine): > > > > > > r0 = *(i0) > > > > > > The issue I have is with the second load. I need the result to be: > > > > > > m0 = 4 > > > r1 = *(i0 += m0) (offset is stored in a register(m0), and modifies > > > the original pointer) > > > > > > rather than > > > > > > r1 = *(i0 + 4) (immediate) > > > > > > (The specific registers numbers don't matter). > > > > > > I'm not sure how to achieve that. Currently addr gets matched in > > > SelectAddr in XYZISelDAGToDAG.cpp. It will match the add node and > > > "combine" (I'm not sure this is the correct terminology) it into > > > my LDR node. The result of this can be seen sched.png > > > > > > So instead of that TargetConst<4> I need a something that copies 4 > > > into an M register, and then LDR uses that. > > > > > > I'm not sure if I should put logic in SelectAddr to create a > > > virtual register and create a series of copyfromreg/copytoreg to > > > copy the constant into an M register and then pass that to the > > > node? I feel like that is fairly ugly (as no other back-end does > > > this), but I'm not sure what a better way would be. > > > > > > I have tried looking through other backends for an example, but > > > couldn't find something similar. That could be due to me not > > > knowing where to look. > > > > > > Thanks in advance for any help. > > > > > > Johnny > > > > > > > <isel.png><sched.png>_______________________________________________ > > > LLVM Developers mailing list > > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141024/173eeee2/attachment.html>
Hal Finkel
2014-Oct-24 12:55 UTC
[LLVMdev] Question regarding getElementPtr/Addressing modes in backend
----- Original Message -----> From: "Johnny Val" <johnnydval at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Bruce Hoult" <bruce at hoult.org>, "llvmdev" <llvmdev at cs.uiuc.edu> > Sent: Friday, October 24, 2014 7:22:27 AM > Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing modes in backend > > > Thank you for all the help. > > I feel like I am pretty close to getting this working for my > architecture. The issue I am having right now is that I cannot get > the DAGCombiner to lower/combine a load/store into a pre-indexed > load/store. > > I believe the "issue" is with this code(DAGCombiner.cpp:7841): > > // If the pointer is not an add/sub, or if it doesn't have multiple > uses, bail > // out. There is no reason to make this a preinc/predec. > if ((Ptr.getOpcode() != ISD::ADD && Ptr.getOpcode() != ISD::SUB) || > Ptr.getNode()->hasOneUse()) > return false; > > > Specifically the "Ptr.GetNode()->hasOneUse()" statement. It seems > like a fair assumption to make as most architectures probably have > the option for "normal" loads as well as preinc/predec loads. Sadly > my architecture only supports preinc addressing when accessing > memory. > > What would you suggest to do in this case? Create custom lowering for > ISD::STORE/LOAD in my backend? Or do you think it's better to patch > DAGCombiner.cpp to allow targets to disable that check?Yes, you can custom lower the loads and stores and do the rewriting in the backend. You can also just write a separate load/store pattern that feed in a '0' to the base that gets updated. Regardless of how you do it, you should handle this in the backend. -Hal> > Cheers, > > Johnny > > > > > On Thu, Oct 23, 2014 at 7:19 PM, Hal Finkel < hfinkel at anl.gov > > wrote: > > > ----- Original Message ----- > > From: "Bruce Hoult" < bruce at hoult.org > > > To: "Johnny Val" < johnnydval at gmail.com > > > Cc: "< llvmdev at cs.uiuc.edu >" < llvmdev at cs.uiuc.edu > > > Sent: Thursday, October 23, 2014 8:31:35 AM > > Subject: Re: [LLVMdev] Question regarding getElementPtr/Addressing > > modes in backend > > > > Many CPU instruction sets have "autoincrement" / "postincrement" > > (pop) / "predecrement" (push) instructions that dereference a > > pointer register and also alter it by the size of the load or > > store. > > > > > > The PowerPC has "load byte/short/word with update" instructions > > that > > add an arbitrary positive or negative literal offset (unrelated to > > the load/store size) to the pointer register BEFORE the load or > > store, and update the register with the final address. > > > > So maybe the PowerPC back end is worth looking at? > > Yes, and so it ARM (which has both pre- and post-increment > loads/stores). > > For PowerPC, for example, this works as follows: > > In PPCISelLowering.cpp, we have this: > > // PowerPC has pre-inc load and store's. > setIndexedLoadAction(ISD::PRE_INC, MVT::i1, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i8, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i16, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i32, Legal); > setIndexedLoadAction(ISD::PRE_INC, MVT::i64, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i1, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i8, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i16, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i32, Legal); > setIndexedStoreAction(ISD::PRE_INC, MVT::i64, Legal); > > Then the DAGCombiner calls into the backend using the function > PPCTargetLowering::getPreIndexedAddressParts, which tells DAGCombine > when it can combine the the load and the increment into a singe > pre/post-increment load or store. > > The pre-increment stores are instruction-selected in PPCInstrInfo.td > (look for the pre_store patterns). The pre-increment loads are > selected in PPCISelDAGToDAG.cpp (TableGen can't yet handle > instructions returning multiple outputs), look for the code just > after ISD::PRE_INC around line 1060. > > From your formula below, it looks like you have a pre-increment load. > > -Hal > > > > > > > > > On Fri, Oct 24, 2014 at 1:30 AM, Johnny Val < johnnydval at gmail.com > > > > > wrote: > > > > > > > > Hi Steve, > > > > Thanks for the tip regarding MIOperandInfo, I didn't think of that > > part of the tablegen description. > > > > Sadly, I did actually mean: r1 = *(i0 += m0). > > > > So increment i0 by m0. Read memory the memory location "pointed" to > > by i0. Store in r1. Sadly I am not too familiar with compiler > > terminology, so I don't know if there is a proper term for such a > > load. > > > > > > > > > > > > > > On Thu, Oct 23, 2014 at 12:23 PM, Steve Montgomery < > > stephen.montgomery3 at btinternet.com > wrote: > > > > > > You might need to modify selectAddr so it doesn't fold the add node > > but instead returns two operands: one for the pointer and one for > > the offset. You can control the register classes for the two > > components of the address using MIOperandInfo so your base pointer > > can be IClass and the offset MClass. See SparcInstrInfo.td (and > > probably others) for examples. > > > > I'm assuming you meant r1 = *(i0 + m0) rather than r1 = *(i0 +> > m0)? > > I don't know how you'd achieve that latter. > > > > Steve Montgomery > > > > On 21 Oct 2014, at 17:15, Johnny Val < johnnydval at gmail.com > > > wrote: > > > > > > > > > Hi, > > > > > > I am writing a backend and having issues with properly lowering > > > the > > > result of getElementPtr ( specifically the add node that it > > > generates). > > > > > > If we take this IR: > > > > > > %struct.rectangle = type { i24, i24 } > > > > > > ; Function Attrs: nounwind readonly > > > define i24 @area(%struct.rectangle* nocapture readonly %r) #0 { > > > entry: > > > %width = getelementptr inbounds %struct.rectangle* %r, i16 0, i32 > > > 0 > > > %0 = load i24* %width, align 4, !tbaa !1 > > > %height = getelementptr inbounds %struct.rectangle* %r, i16 0, > > > i32 > > > 1 > > > %1 = load i24* %height, align 4, !tbaa !6 > > > %mul = mul nsw i24 %1, %0 > > > ret i24 %mul > > > } > > > > > > The DAG before isel would look like isel.png. > > > > > > I would then pattern match the load nodes with: > > > > > > [(set i24:$val, (load addr:$addr))] > > > > > > Where addr is a complex pattern. This is fine for the 0th element > > > in the struct, as there is no offset so the resultant assembly > > > would be (which works fine): > > > > > > r0 = *(i0) > > > > > > The issue I have is with the second load. I need the result to > > > be: > > > > > > m0 = 4 > > > r1 = *(i0 += m0) (offset is stored in a register(m0), and > > > modifies > > > the original pointer) > > > > > > rather than > > > > > > r1 = *(i0 + 4) (immediate) > > > > > > (The specific registers numbers don't matter). > > > > > > I'm not sure how to achieve that. Currently addr gets matched in > > > SelectAddr in XYZISelDAGToDAG.cpp. It will match the add node and > > > "combine" (I'm not sure this is the correct terminology) it into > > > my LDR node. The result of this can be seen sched.png > > > > > > So instead of that TargetConst<4> I need a something that copies > > > 4 > > > into an M register, and then LDR uses that. > > > > > > I'm not sure if I should put logic in SelectAddr to create a > > > virtual register and create a series of copyfromreg/copytoreg to > > > copy the constant into an M register and then pass that to the > > > node? I feel like that is fairly ugly (as no other back-end does > > > this), but I'm not sure what a better way would be. > > > > > > I have tried looking through other backends for an example, but > > > couldn't find something similar. That could be due to me not > > > knowing where to look. > > > > > > Thanks in advance for any help. > > > > > > Johnny > > > > > > > <isel.png><sched.png>_______________________________________________ > > > LLVM Developers mailing list > > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory > >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory