Chad Rosier
2015-Feb-19 18:06 UTC
[LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid
Hi All, I've encountered an issue where tail merging MIs is causing a problem with the post-RA MI scheduler dependency analysis and I'm not sure of the best way to address the problem. In my case, the branch folding pass (lib/CodeGen/BranchFolding.cpp) is merging common code from BB#14 and BB#15 into BB#16. It's clear that there are 4 common instructions (marked with an *) in BB#14 and BB#15: -------------------------------------------------------------------------------- BB#14: derived from LLVM BB %if.end.1 Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 %X1 %X10 %X11 %X12 %X13 %X9 Predecessors according to CFG: BB#12 %X5<def> = ADDXrr %X16, %X13 * %W19<def> = LDRBBui %X5, 1; mem:LD1[%scevgep95](tbaa=<0x6e02518>) * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR * %W2<def> = SUBWrr %W3<kill>, %W19<kill> * STRBBui %W2<kill>, %X5<kill>, 1; mem:ST1[%scevgep95](tbaa=<0x6e02518>) Successors according to CFG: BB#16 BB#15: derived from LLVM BB %L20.1 Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 %X1 %X10 %X11 %X12 %X13 %X9 Predecessors according to CFG: BB#12 %X5<def> = ADDXrr %X17, %X13 * %W19<def> = LDRBBui %X5, 1; mem:LD1[%scevgep91](tbaa=<0x6e02518>) * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR * %W2<def> = SUBWrr %W3<kill>, %W19<kill> * STRBBui %W2<kill>, %X5<kill>, 1; mem:ST1[%scevgep91](tbaa=<0x6e02518>) Successors according to CFG: BB#16 -------------------------------------------------------------------------------- The pass splits the edges between BB#14/BB#15 and inserts the common code into a new common successor BB#17. BB#17 is then merged into BB#16: -------------------------------------------------------------------------------- BB#16: derived from LLVM BB %L30.1 Predecessors according to CFG: BB#15 BB#14 %W19<def> = LDRBBui %X5, 1; mem:LD1[%scevgep91](tbaa=<0x6e02518>) (BB#15) %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR (BB#15) %W2<def> = SUBWrr %W3<kill>, %W19<kill> (BB#15) STRBBui %W2<kill>, %X5<kill>, 1; mem:ST1[%scevgep91](tbaa=<0x6e02518>) (BB#15) %W7<def> = LDRBBui %X7<kill>, 1; mem:LD1[%scevgep99](tbaa=<0x6e02518>) %W0<def> = LDRSBWui %X0<kill>, 1; mem:LD1[%scevgep101](tbaa=<0x6e02518>) %W6<def> = LDRBBui %X6<kill>, 1; mem:LD1[%scevgep103](tbaa=<0x6e02518>) %W5<def> = MADDWrrr %W6<kill>, %W0<kill>, %W7<kill> %X9<def> = ADDXri %X9<kill>, 2, 0 %X13<def> = ADDXri %X13<kill>, 2, 0 %WZR<def,dead> = SUBSWrr %W4, %W8, %NZCV<imp-def>, %X4<imp-use,kill> STRBBui %W5<kill>, %X18<kill>, 1; mem:ST1[%31](tbaa=<0x6e02518>) Bcc 1, <BB#9>, %NZCV<imp-use> Successors according to CFG: BB#13(4) BB#9(124) -------------------------------------------------------------------------------- This transformation looks fine, IMO. However, notice that the merged instructions are derived from BB#15. The memory operations in BB#15 were accessing array 'C', while the memory operations in BB#14 were accessing array 'B'. The next two loads %W7<def> = LDRBBui %X7<kill>, 1; mem:LD1[%scevgep99](tbaa=<0x6e02518>) %W0<def> = LDRSBWui %X0<kill>, 1; mem:LD1[%scevgep101](tbaa=<0x6e02518>) load from array 'B' and 'C', respectively. The problem occurs because ScheduleDAGInstrs does not correctly identify the true dependency between the STRBBui %W2<kill>, %X5<kill>, 1; mem:ST1[%scevgep91](tbaa=<0x6e02518>) and %W0<def> = LDRSBWui %X0<kill>, 1; mem:LD1[%scevgep101](tbaa=<0x6e02518>) instructions. AFAICT, the problem is that the Values associated with the MI's memory operands are used to determine the underlying objects accessed by a memory operation (See getUnderlyingObjectsForInstr() in ScheduleDAGInstrs.cpp). In turn the dependency analysis relies on these Values. When the STRBBui is merged into BB#16 it is derived from BB#15. Thus, the STRBBui associated Value is 'C'. The dependency analysis detects the dependency on array 'C', but misses the dependency on 'B'. Later, the scheduler hoists the LDRSBWui above the STRBBui and violates a true dependency. Boo.. Assuming my analysis is correct, the first solution that comes to mind is to disable tail merging when we encounter a memory operation (unless they access the same objects). Thoughts? I'm not particularly fond of inspecting the Values associated with MI operands for dependency analysis, but it's what we have now.. I would like to know everyones thoughts. Is there a more robust solution I'm missing? What happens if another MI-level pass performs a similar type of merging? Chad
Hal Finkel
2015-Feb-19 18:23 UTC
[LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid
----- Original Message -----> From: "Chad Rosier" <mcrosier at codeaurora.org> > To: llvmdev at cs.uiuc.edu > Sent: Thursday, February 19, 2015 12:06:10 PM > Subject: [LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid > > Hi All, > I've encountered an issue where tail merging MIs is causing a problem > with > the post-RA MI scheduler dependency analysis and I'm not sure of the > best > way to address the problem. > > In my case, the branch folding pass (lib/CodeGen/BranchFolding.cpp) > is > merging common code from BB#14 and BB#15 into BB#16. It's clear that > there are 4 common instructions (marked with an *) in BB#14 and > BB#15: > > -------------------------------------------------------------------------------- > BB#14: derived from LLVM BB %if.end.1 > Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 > %X1 > %X10 %X11 %X12 %X13 %X9 > Predecessors according to CFG: BB#12 > %X5<def> = ADDXrr %X16, %X13 > * %W19<def> = LDRBBui %X5, 1; > mem:LD1[%scevgep95](tbaa=<0x6e02518>) > * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR > * %W2<def> = SUBWrr %W3<kill>, %W19<kill> > * STRBBui %W2<kill>, %X5<kill>, 1; > mem:ST1[%scevgep95](tbaa=<0x6e02518>) > Successors according to CFG: BB#16 > > BB#15: derived from LLVM BB %L20.1 > Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 > %X1 > %X10 %X11 %X12 %X13 %X9 > Predecessors according to CFG: BB#12 > %X5<def> = ADDXrr %X17, %X13 > * %W19<def> = LDRBBui %X5, 1; > mem:LD1[%scevgep91](tbaa=<0x6e02518>) > * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR > * %W2<def> = SUBWrr %W3<kill>, %W19<kill> > * STRBBui %W2<kill>, %X5<kill>, 1; > mem:ST1[%scevgep91](tbaa=<0x6e02518>) > Successors according to CFG: BB#16 > -------------------------------------------------------------------------------- > > The pass splits the edges between BB#14/BB#15 and inserts the common > code > into a new common successor BB#17. BB#17 is then merged into BB#16: > > -------------------------------------------------------------------------------- > BB#16: derived from LLVM BB %L30.1 > Predecessors according to CFG: BB#15 BB#14 > %W19<def> = LDRBBui %X5, 1; > mem:LD1[%scevgep91](tbaa=<0x6e02518>) > (BB#15) > %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR > (BB#15) > %W2<def> = SUBWrr %W3<kill>, %W19<kill> > (BB#15) > STRBBui %W2<kill>, %X5<kill>, 1; > mem:ST1[%scevgep91](tbaa=<0x6e02518>) (BB#15) > > %W7<def> = LDRBBui %X7<kill>, 1; > mem:LD1[%scevgep99](tbaa=<0x6e02518>) > %W0<def> = LDRSBWui %X0<kill>, 1; > mem:LD1[%scevgep101](tbaa=<0x6e02518>) > %W6<def> = LDRBBui %X6<kill>, 1; > mem:LD1[%scevgep103](tbaa=<0x6e02518>) > %W5<def> = MADDWrrr %W6<kill>, %W0<kill>, %W7<kill> > %X9<def> = ADDXri %X9<kill>, 2, 0 > %X13<def> = ADDXri %X13<kill>, 2, 0 > %WZR<def,dead> = SUBSWrr %W4, %W8, %NZCV<imp-def>, > %X4<imp-use,kill> > STRBBui %W5<kill>, %X18<kill>, 1; > mem:ST1[%31](tbaa=<0x6e02518>) > Bcc 1, <BB#9>, %NZCV<imp-use> > Successors according to CFG: BB#13(4) BB#9(124) > -------------------------------------------------------------------------------- > > This transformation looks fine, IMO. However, notice that the merged > instructions are derived from BB#15. The memory operations in BB#15 > were > accessing array 'C', while the memory operations in BB#14 were > accessing > array 'B'. > > The next two loads > > %W7<def> = LDRBBui %X7<kill>, 1; > mem:LD1[%scevgep99](tbaa=<0x6e02518>) > %W0<def> = LDRSBWui %X0<kill>, 1; > mem:LD1[%scevgep101](tbaa=<0x6e02518>) > > load from array 'B' and 'C', respectively. The problem occurs > because > ScheduleDAGInstrs does not correctly identify the true dependency > between > the > > STRBBui %W2<kill>, %X5<kill>, 1; > mem:ST1[%scevgep91](tbaa=<0x6e02518>) > > and > > %W0<def> = LDRSBWui %X0<kill>, 1; > mem:LD1[%scevgep101](tbaa=<0x6e02518>) > > instructions. AFAICT, the problem is that the Values associated with > the > MI's memory operands are used to determine the underlying objects > accessed > by a memory operation (See getUnderlyingObjectsForInstr() in > ScheduleDAGInstrs.cpp). In turn the dependency analysis relies on > these > Values. > > When the STRBBui is merged into BB#16 it is derived from BB#15. > Thus, the > STRBBui associated Value is 'C'. The dependency analysis detects the > dependency on array 'C', but misses the dependency on 'B'. Later, > the > scheduler hoists the LDRSBWui above the STRBBui and violates a true > dependency. Boo.. > > Assuming my analysis is correct, the first solution that comes to > mind is > to disable tail merging when we encounter a memory operation (unless > they > access the same objects). Thoughts? > > I'm not particularly fond of inspecting the Values associated with MI > operands for dependency analysis, but it's what we have now.. > > I would like to know everyones thoughts. Is there a more robust > solution > I'm missing? What happens if another MI-level pass performs a > similar > type of merging?You can also remove the MMOs when you tail merge, or add multiple MMOs. Currently the two are equivalent because we ignore all MMOs on instructions with multiple MMOs in ScheduleDAGInstrs.cpp. It would be really nice, however, to support multiple MMOs, so work here would certainly be appreciated. -Hal> > Chad > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
Chad Rosier
2015-Feb-19 19:02 UTC
[LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid
> ----- Original Message ----- >> From: "Chad Rosier" <mcrosier at codeaurora.org> >> To: llvmdev at cs.uiuc.edu >> Sent: Thursday, February 19, 2015 12:06:10 PM >> Subject: [LLVMdev] ScheduleDAGInstrs computes deps using IR Values that >> may be invalid >> >> Hi All, >> I've encountered an issue where tail merging MIs is causing a problem >> with >> the post-RA MI scheduler dependency analysis and I'm not sure of the >> best >> way to address the problem. >> >> In my case, the branch folding pass (lib/CodeGen/BranchFolding.cpp) >> is >> merging common code from BB#14 and BB#15 into BB#16. It's clear that >> there are 4 common instructions (marked with an *) in BB#14 and >> BB#15: >> >> -------------------------------------------------------------------------------- >> BB#14: derived from LLVM BB %if.end.1 >> Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 >> %X1 >> %X10 %X11 %X12 %X13 %X9 >> Predecessors according to CFG: BB#12 >> %X5<def> = ADDXrr %X16, %X13 >> * %W19<def> = LDRBBui %X5, 1; >> mem:LD1[%scevgep95](tbaa=<0x6e02518>) >> * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR >> * %W2<def> = SUBWrr %W3<kill>, %W19<kill> >> * STRBBui %W2<kill>, %X5<kill>, 1; >> mem:ST1[%scevgep95](tbaa=<0x6e02518>) >> Successors according to CFG: BB#16 >> >> BB#15: derived from LLVM BB %L20.1 >> Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 >> %X1 >> %X10 %X11 %X12 %X13 %X9 >> Predecessors according to CFG: BB#12 >> %X5<def> = ADDXrr %X17, %X13 >> * %W19<def> = LDRBBui %X5, 1; >> mem:LD1[%scevgep91](tbaa=<0x6e02518>) >> * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR >> * %W2<def> = SUBWrr %W3<kill>, %W19<kill> >> * STRBBui %W2<kill>, %X5<kill>, 1; >> mem:ST1[%scevgep91](tbaa=<0x6e02518>) >> Successors according to CFG: BB#16 >> -------------------------------------------------------------------------------- >> >> The pass splits the edges between BB#14/BB#15 and inserts the common >> code >> into a new common successor BB#17. BB#17 is then merged into BB#16: >> >> -------------------------------------------------------------------------------- >> BB#16: derived from LLVM BB %L30.1 >> Predecessors according to CFG: BB#15 BB#14 >> %W19<def> = LDRBBui %X5, 1; >> mem:LD1[%scevgep91](tbaa=<0x6e02518>) >> (BB#15) >> %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR >> (BB#15) >> %W2<def> = SUBWrr %W3<kill>, %W19<kill> >> (BB#15) >> STRBBui %W2<kill>, %X5<kill>, 1; >> mem:ST1[%scevgep91](tbaa=<0x6e02518>) (BB#15) >> >> %W7<def> = LDRBBui %X7<kill>, 1; >> mem:LD1[%scevgep99](tbaa=<0x6e02518>) >> %W0<def> = LDRSBWui %X0<kill>, 1; >> mem:LD1[%scevgep101](tbaa=<0x6e02518>) >> %W6<def> = LDRBBui %X6<kill>, 1; >> mem:LD1[%scevgep103](tbaa=<0x6e02518>) >> %W5<def> = MADDWrrr %W6<kill>, %W0<kill>, %W7<kill> >> %X9<def> = ADDXri %X9<kill>, 2, 0 >> %X13<def> = ADDXri %X13<kill>, 2, 0 >> %WZR<def,dead> = SUBSWrr %W4, %W8, %NZCV<imp-def>, >> %X4<imp-use,kill> >> STRBBui %W5<kill>, %X18<kill>, 1; >> mem:ST1[%31](tbaa=<0x6e02518>) >> Bcc 1, <BB#9>, %NZCV<imp-use> >> Successors according to CFG: BB#13(4) BB#9(124) >> -------------------------------------------------------------------------------- >> >> This transformation looks fine, IMO. However, notice that the merged >> instructions are derived from BB#15. The memory operations in BB#15 >> were >> accessing array 'C', while the memory operations in BB#14 were >> accessing >> array 'B'. >> >> The next two loads >> >> %W7<def> = LDRBBui %X7<kill>, 1; >> mem:LD1[%scevgep99](tbaa=<0x6e02518>) >> %W0<def> = LDRSBWui %X0<kill>, 1; >> mem:LD1[%scevgep101](tbaa=<0x6e02518>) >> >> load from array 'B' and 'C', respectively. The problem occurs >> because >> ScheduleDAGInstrs does not correctly identify the true dependency >> between >> the >> >> STRBBui %W2<kill>, %X5<kill>, 1; >> mem:ST1[%scevgep91](tbaa=<0x6e02518>) >> >> and >> >> %W0<def> = LDRSBWui %X0<kill>, 1; >> mem:LD1[%scevgep101](tbaa=<0x6e02518>) >> >> instructions. AFAICT, the problem is that the Values associated with >> the >> MI's memory operands are used to determine the underlying objects >> accessed >> by a memory operation (See getUnderlyingObjectsForInstr() in >> ScheduleDAGInstrs.cpp). In turn the dependency analysis relies on >> these >> Values. >> >> When the STRBBui is merged into BB#16 it is derived from BB#15. >> Thus, the >> STRBBui associated Value is 'C'. The dependency analysis detects the >> dependency on array 'C', but misses the dependency on 'B'. Later, >> the >> scheduler hoists the LDRSBWui above the STRBBui and violates a true >> dependency. Boo.. >> >> Assuming my analysis is correct, the first solution that comes to >> mind is >> to disable tail merging when we encounter a memory operation (unless >> they >> access the same objects). Thoughts? >> >> I'm not particularly fond of inspecting the Values associated with MI >> operands for dependency analysis, but it's what we have now.. >> >> I would like to know everyones thoughts. Is there a more robust >> solution >> I'm missing? What happens if another MI-level pass performs a >> similar >> type of merging? > > You can also remove the MMOs when you tail merge, or add multiple MMOs. > Currently the two are equivalent because we ignore all MMOs on > instructions with multiple MMOs in ScheduleDAGInstrs.cpp. It would be > really nice, however, to support multiple MMOs, so work here would > certainly be appreciated.Thanks for the advice, Hal. That sounds like a much better solution now that I better understand how ScheduleDAGInstrs.cpp uses MMOs.> > -Hal > >> >> Chad >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory >