I filed a bug report (Bug 12205). Please take a look when you have time. Per your suggestion, I also attached a patch which attaches to load or store nodes a machinepointerinfo that points to a stack frame object when it can infer they are actually reading from or writing to the stack. The test that was failing passes if I apply this patch, but I doubt this is the right approach, because this will fail if InferPointerInfo in SelectionDAG.cpp cannot discover a load or store is accessing a stack object (it can only infer the information if the expression for the pointer is simple, for example add FI + const). An alternative approach might be to make the machinepointerinfo of the stores refer to %struct.ObjPointStruct* byval %P or refer to nothing, but that currently doesn't seem to be possible. On Tue, Mar 6, 2012 at 6:01 PM, Andrew Trick <atrick at apple.com> wrote:> On Mar 6, 2012, at 5:05 PM, Akira Hatanaka <ahatanak at gmail.com> wrote: >> I am having trouble trying to enable post RA scheduler for the Mips backend. >> >> This is the bit code of the function I am compiling: >> >> (gdb) p MF.Fn->dump() >> >> define void @PointToHPoint(%struct.HPointStruct* noalias sret >> %agg.result, %struct.ObjPointStruct* byval %P) nounwind { >> entry: >> %res = alloca %struct.HPointStruct, align 8 >> %x2 = bitcast %struct.ObjPointStruct* %P to double* >> %0 = load double* %x2, align 8 >> >> The third instruction is loading the first floating point double of >> structure %P which is being passed by value. >> >> This is the machine function right after completion of isel: >> (gdb) p MF->dump() >> # Machine code for function PointToHPoint: >> Frame Objects: >> fi#-1: size=48, align=8, fixed, at location [SP+8] >> fi#0: size=32, align=8, at location [SP] >> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >> >> BB#0: derived from LLVM BB %entry >> SW %vreg2, <fi#-1>, 4; mem:ST4[FixedStack-1+4] CPURegs:%vreg2 >> SW %vreg1, <fi#-1>, 0; mem:ST4[FixedStack-1](align=8) CPURegs:%vreg1 >> %vreg3<def> = COPY %vreg0; CPURegs:%vreg3,%vreg0 >> %vreg4<def> = LDC1 <fi#-1>, 0; mem:LD8[%x2] AFGR64:%vreg4 >> >> >> The first two stores write the values in argument registers $6 and $7 >> to frame object -1 >> (Mips stores byval arguments passed in registers to the stack). >> The fourth instruction LDC1 loads the value written by the first two >> stores as a floating point double. >> >> This is the machine function just before post RA scheduling: >> (gdb) p MF.dump() >> # Machine code for function PointToHPoint: >> Frame Objects: >> fi#-1: size=48, align=8, fixed, at location [SP+8] >> fi#0: size=32, align=8, at location [SP-32] >> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >> >> BB#0: derived from LLVM BB %entry >> Live Ins: %A0 %A2 %A3 >> %SP<def> = ADDiu %SP, -32 >> PROLOG_LABEL <MCSym=$tmp0> >> SW %A3<kill>, %SP, 44; mem:ST4[FixedStack-1+4] >> SW %A2<kill>, %SP, 40; mem:ST4[FixedStack-1](align=8) >> %D0<def> = LDC1 %SP, 40; mem:LD8[%x2] >> >> >> The frame index operands of the first two stores and the fourth load >> have been lowered to real addresses. >> Since the first two SWs store to ($sp + 44) and ($sp + 40), and >> instruction LDC1 loads from ($sp + 40), >> there should be a dependency between these instructions. >> >> However, when ScheduleDAGInstrs::BuildSchedGraph(AliasAnalysis *AA) >> builds the schedule graph, >> there are no dependency edges added between the two SWs and LDC1 because >> getUnderlyingObjectForInstr returns different objects for these instructions: >> >> underlying object of SWs: FixedStack-1 >> underlying object of LDC1: struct.ObjPointStruct* %P >> >> >> Is this a bug? >> Or are there ways to tell BuildSchedGraph it should add dependency edges? > > This is a wild guess. But it looks to me like your load's machineMemOperand should have been converted to refer to the stack frame. I would call that an ISEL bug. I can't say where the bug is without stepping through a test case. > > Maybe someone who's worked in this area of ISEL can give you a better hint. In the meantime, I would file a PR. > > -Andy-------------- next part -------------- A non-text attachment was scrubbed... Name: machineptrinfo.patch Type: text/x-patch Size: 1123 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120307/20204c6d/attachment.bin>
On Mar 7, 2012, at 11:34 AM, Akira Hatanaka <ahatanak at gmail.com> wrote:> I filed a bug report (Bug 12205). > Please take a look when you have time. > > Per your suggestion, I also attached a patch which attaches to load or > store nodes a machinepointerinfo that points to a stack frame object > when it can infer they are actually reading from or writing to the > stack. The test that was failing passes if I apply this patch, but I > doubt this is the right approach, because this will fail if > InferPointerInfo in SelectionDAG.cpp cannot discover a load or store > is accessing a stack object (it can only infer the information if the > expression for the pointer is simple, for example add FI + const). > > An alternative approach might be to make the machinepointerinfo of the > stores refer to %struct.ObjPointStruct* byval %P or refer to nothing, > but that currently doesn't seem to be possible.I've thought of several ways we could potentially handle this. All are fairly messy without recognizing the situation during argument lowering. I'm not very familiar with the argument lowering code. But it seems to me you should be able to lookup the Value for the formal argument when you generate stack stores. Can you create a MachinePointerInfo for each store that refers to the argument value and proper offset? These initializers will no longer appear to alias with stack accesses, but that's probably ok. What exactly do you think is not possible? If finding the formal argument value and offset is too hard, I suppose there are other hacks you could try. I'm not encouraging it though. Is it valid to set MachinePointerInfo.V = 0? You could try overriding it after calling getStore. If that's not valid, you could probably create a PseudoSourceValue that aliases with everything. I suppose the hackiest thing would be marking the store volatile. The alternative would be to define a new MachineMemOperand flag. I really don't think we should have to go that far though. -Andy> On Tue, Mar 6, 2012 at 6:01 PM, Andrew Trick <atrick at apple.com> wrote: >> On Mar 6, 2012, at 5:05 PM, Akira Hatanaka <ahatanak at gmail.com> wrote: >>> I am having trouble trying to enable post RA scheduler for the Mips backend. >>> >>> This is the bit code of the function I am compiling: >>> >>> (gdb) p MF.Fn->dump() >>> >>> define void @PointToHPoint(%struct.HPointStruct* noalias sret >>> %agg.result, %struct.ObjPointStruct* byval %P) nounwind { >>> entry: >>> %res = alloca %struct.HPointStruct, align 8 >>> %x2 = bitcast %struct.ObjPointStruct* %P to double* >>> %0 = load double* %x2, align 8 >>> >>> The third instruction is loading the first floating point double of >>> structure %P which is being passed by value. >>> >>> This is the machine function right after completion of isel: >>> (gdb) p MF->dump() >>> # Machine code for function PointToHPoint: >>> Frame Objects: >>> fi#-1: size=48, align=8, fixed, at location [SP+8] >>> fi#0: size=32, align=8, at location [SP] >>> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >>> >>> BB#0: derived from LLVM BB %entry >>> SW %vreg2, <fi#-1>, 4; mem:ST4[FixedStack-1+4] CPURegs:%vreg2 >>> SW %vreg1, <fi#-1>, 0; mem:ST4[FixedStack-1](align=8) CPURegs:%vreg1 >>> %vreg3<def> = COPY %vreg0; CPURegs:%vreg3,%vreg0 >>> %vreg4<def> = LDC1 <fi#-1>, 0; mem:LD8[%x2] AFGR64:%vreg4 >>> >>> >>> The first two stores write the values in argument registers $6 and $7 >>> to frame object -1 >>> (Mips stores byval arguments passed in registers to the stack). >>> The fourth instruction LDC1 loads the value written by the first two >>> stores as a floating point double. >>> >>> This is the machine function just before post RA scheduling: >>> (gdb) p MF.dump() >>> # Machine code for function PointToHPoint: >>> Frame Objects: >>> fi#-1: size=48, align=8, fixed, at location [SP+8] >>> fi#0: size=32, align=8, at location [SP-32] >>> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >>> >>> BB#0: derived from LLVM BB %entry >>> Live Ins: %A0 %A2 %A3 >>> %SP<def> = ADDiu %SP, -32 >>> PROLOG_LABEL <MCSym=$tmp0> >>> SW %A3<kill>, %SP, 44; mem:ST4[FixedStack-1+4] >>> SW %A2<kill>, %SP, 40; mem:ST4[FixedStack-1](align=8) >>> %D0<def> = LDC1 %SP, 40; mem:LD8[%x2] >>> >>> >>> The frame index operands of the first two stores and the fourth load >>> have been lowered to real addresses. >>> Since the first two SWs store to ($sp + 44) and ($sp + 40), and >>> instruction LDC1 loads from ($sp + 40), >>> there should be a dependency between these instructions. >>> >>> However, when ScheduleDAGInstrs::BuildSchedGraph(AliasAnalysis *AA) >>> builds the schedule graph, >>> there are no dependency edges added between the two SWs and LDC1 because >>> getUnderlyingObjectForInstr returns different objects for these instructions: >>> >>> underlying object of SWs: FixedStack-1 >>> underlying object of LDC1: struct.ObjPointStruct* %P >>> >>> >>> Is this a bug? >>> Or are there ways to tell BuildSchedGraph it should add dependency edges? >> >> This is a wild guess. But it looks to me like your load's machineMemOperand should have been converted to refer to the stack frame. I would call that an ISEL bug. I can't say where the bug is without stepping through a test case. >> >> Maybe someone who's worked in this area of ISEL can give you a better hint. In the meantime, I would file a PR. >> >> -Andy > <machineptrinfo.patch>
Thank you for your suggestions. I implemented the first approach (provided the byval argument and offset to MachinePointerInfo) and it seems to have fixed the instruction ordering problem. It was a lot simpler than initially expected. In this particular case, is the user responsible for providing alias information to MachinePointerInfo to guarantee instructions are emitted in the correct order? It seems to me that getStore should not try to infer pointer information unless the user explicitly asks for it. The scheduler will then conservatively treat it as a load or store that aliases anything. On Mon, Mar 12, 2012 at 10:39 PM, Andrew Trick <atrick at apple.com> wrote:> > On Mar 7, 2012, at 11:34 AM, Akira Hatanaka <ahatanak at gmail.com> wrote: > >> I filed a bug report (Bug 12205). >> Please take a look when you have time. >> >> Per your suggestion, I also attached a patch which attaches to load or >> store nodes a machinepointerinfo that points to a stack frame object >> when it can infer they are actually reading from or writing to the >> stack. The test that was failing passes if I apply this patch, but I >> doubt this is the right approach, because this will fail if >> InferPointerInfo in SelectionDAG.cpp cannot discover a load or store >> is accessing a stack object (it can only infer the information if the >> expression for the pointer is simple, for example add FI + const). >> >> An alternative approach might be to make the machinepointerinfo of the >> stores refer to %struct.ObjPointStruct* byval %P or refer to nothing, >> but that currently doesn't seem to be possible. > > I've thought of several ways we could potentially handle this. All are fairly messy without recognizing the situation during argument lowering. I'm not very familiar with the argument lowering code. But it seems to me you should be able to lookup the Value for the formal argument when you generate stack stores. Can you create a MachinePointerInfo for each store that refers to the argument value and proper offset? These initializers will no longer appear to alias with stack accesses, but that's probably ok. What exactly do you think is not possible? > > If finding the formal argument value and offset is too hard, I suppose there are other hacks you could try. I'm not encouraging it though. Is it valid to set MachinePointerInfo.V = 0? You could try overriding it after calling getStore. If that's not valid, you could probably create a PseudoSourceValue that aliases with everything. I suppose the hackiest thing would be marking the store volatile. The alternative would be to define a new MachineMemOperand flag. I really don't think we should have to go that far though. > > -Andy > >> On Tue, Mar 6, 2012 at 6:01 PM, Andrew Trick <atrick at apple.com> wrote: >>> On Mar 6, 2012, at 5:05 PM, Akira Hatanaka <ahatanak at gmail.com> wrote: >>>> I am having trouble trying to enable post RA scheduler for the Mips backend. >>>> >>>> This is the bit code of the function I am compiling: >>>> >>>> (gdb) p MF.Fn->dump() >>>> >>>> define void @PointToHPoint(%struct.HPointStruct* noalias sret >>>> %agg.result, %struct.ObjPointStruct* byval %P) nounwind { >>>> entry: >>>> %res = alloca %struct.HPointStruct, align 8 >>>> %x2 = bitcast %struct.ObjPointStruct* %P to double* >>>> %0 = load double* %x2, align 8 >>>> >>>> The third instruction is loading the first floating point double of >>>> structure %P which is being passed by value. >>>> >>>> This is the machine function right after completion of isel: >>>> (gdb) p MF->dump() >>>> # Machine code for function PointToHPoint: >>>> Frame Objects: >>>> fi#-1: size=48, align=8, fixed, at location [SP+8] >>>> fi#0: size=32, align=8, at location [SP] >>>> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >>>> >>>> BB#0: derived from LLVM BB %entry >>>> SW %vreg2, <fi#-1>, 4; mem:ST4[FixedStack-1+4] CPURegs:%vreg2 >>>> SW %vreg1, <fi#-1>, 0; mem:ST4[FixedStack-1](align=8) CPURegs:%vreg1 >>>> %vreg3<def> = COPY %vreg0; CPURegs:%vreg3,%vreg0 >>>> %vreg4<def> = LDC1 <fi#-1>, 0; mem:LD8[%x2] AFGR64:%vreg4 >>>> >>>> >>>> The first two stores write the values in argument registers $6 and $7 >>>> to frame object -1 >>>> (Mips stores byval arguments passed in registers to the stack). >>>> The fourth instruction LDC1 loads the value written by the first two >>>> stores as a floating point double. >>>> >>>> This is the machine function just before post RA scheduling: >>>> (gdb) p MF.dump() >>>> # Machine code for function PointToHPoint: >>>> Frame Objects: >>>> fi#-1: size=48, align=8, fixed, at location [SP+8] >>>> fi#0: size=32, align=8, at location [SP-32] >>>> Function Live Ins: %A0 in %vreg0, %A2 in %vreg1, %A3 in %vreg2 >>>> >>>> BB#0: derived from LLVM BB %entry >>>> Live Ins: %A0 %A2 %A3 >>>> %SP<def> = ADDiu %SP, -32 >>>> PROLOG_LABEL <MCSym=$tmp0> >>>> SW %A3<kill>, %SP, 44; mem:ST4[FixedStack-1+4] >>>> SW %A2<kill>, %SP, 40; mem:ST4[FixedStack-1](align=8) >>>> %D0<def> = LDC1 %SP, 40; mem:LD8[%x2] >>>> >>>> >>>> The frame index operands of the first two stores and the fourth load >>>> have been lowered to real addresses. >>>> Since the first two SWs store to ($sp + 44) and ($sp + 40), and >>>> instruction LDC1 loads from ($sp + 40), >>>> there should be a dependency between these instructions. >>>> >>>> However, when ScheduleDAGInstrs::BuildSchedGraph(AliasAnalysis *AA) >>>> builds the schedule graph, >>>> there are no dependency edges added between the two SWs and LDC1 because >>>> getUnderlyingObjectForInstr returns different objects for these instructions: >>>> >>>> underlying object of SWs: FixedStack-1 >>>> underlying object of LDC1: struct.ObjPointStruct* %P >>>> >>>> >>>> Is this a bug? >>>> Or are there ways to tell BuildSchedGraph it should add dependency edges? >>> >>> This is a wild guess. But it looks to me like your load's machineMemOperand should have been converted to refer to the stack frame. I would call that an ISEL bug. I can't say where the bug is without stepping through a test case. >>> >>> Maybe someone who's worked in this area of ISEL can give you a better hint. In the meantime, I would file a PR. >>> >>> -Andy >> <machineptrinfo.patch> >