Alex Susu via llvm-dev
2017-Feb-09 22:46 UTC
[llvm-dev] Specify special cases of delay slots in the back end
Hello.

Hal, thank you for the information.

I managed to get inspired from PPCHazardRecognizers.cpp. So I created my very simple [Target]HazardRecognizers.cpp pass that is also derived from ScoreboardHazardRecognizer. My class only implements the method getHazardType(), which checks if, as stated in my first email, for example, I have a store instruction that is storing the value updated by the instruction immediately above, which is NOT ok, since for my processor this is a data hazard and in this case I have to insert a NOP in between by making getHazardType() to:
    return NoopHazard; // this basically emits noop

However, to my surprise, my very simple post-RA scheduler (using my class derived from ScoreboardHazardRecognizer) is cycling FOREVER after this return NoopHazard, by calling getHazardType() again and again for this SAME store instruction I found in the first place with the data hazard problem. So, llc is no longer finishing - I have to stop the process because of this strange behavior.

I was expecting after the first call to getHazardType() with the respective store instruction (and return NoopHazard) that the scheduler would move forward to the other instructions in the DAG/basic-block.

Do you have an idea what can I do to fix this problem?

Thank you very much,
Alex

On 2/3/2017 10:25 PM, Hal Finkel wrote:
> Hi Alex,
>
> You can program a post-RA scheduler which will return NoopHazard in the appropriate
> circumstances. You can look at the PowerPC target (e.g.
> lib/Target/PowerPC/PPCHazardRecognizers.cpp) as an example.
>
> -Hal
>
> On 02/02/2017 05:03 PM, Alex Susu via llvm-dev wrote:
>> Hello.
>> I see there is little information on specifying instructions with delay slots.
>> So could you please tell me how can I insert NOPs (BEFORE or after an instruction)
>> or how to make an aware instruction scheduler in order to avoid miscalculations due to
>> the delay slot effect?
>>
>> More exactly, I have the following constraints on my (SIMD) processor:
>> - certain stores or loads, must be executed 1 cycle after the instruction
>> generating their input operands ends. For example, if I have:
>>     R1 = R2 + R3
>>     LS[R10] = R1 // this will not produce the correct result because it does not see the updated value of R1 from the previous instruction
>> To make this code execute correctly we need to insert a NOP:
>>     R1 = R2 + R3
>>     NOP // or other instruction to fill the delay slot
>>     LS[R10] = R1
>>
>> - a compare instruction requires to add a NOP after it, before the predicated
>> block (something like a conditional JMP instruction) starts.
>>
>> Thank you,
>> Alex
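The store constraint in the quoted original message (a store must not directly follow the instruction that produces its operand) can be illustrated on a straight-line instruction sequence. The sketch below is a toy model and not LLVM code: `Inst`, `reads`, and `insertNops` are hypothetical names made up for illustration, and registers are plain integers. It only inserts NOPs and makes no attempt to fill the slot with a useful instruction.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified instruction record (not an LLVM type).
struct Inst {
    std::string text;       // printable form
    int def;                // register written, or -1 if none
    std::vector<int> uses;  // registers read
    bool isStore;           // true for stores subject to the 1-cycle rule
};

// Returns true if instruction I reads register Reg.
static bool reads(const Inst &I, int Reg) {
    for (int U : I.uses)
        if (U == Reg)
            return true;
    return false;
}

// Copies the sequence, inserting a NOP before every store that reads the
// register written by the instruction emitted immediately before it.
std::vector<Inst> insertNops(const std::vector<Inst> &In) {
    std::vector<Inst> Out;
    for (const Inst &I : In) {
        if (!Out.empty()) {
            int PrevDef = Out.back().def;  // copy before Out is modified
            if (I.isStore && PrevDef >= 0 && reads(I, PrevDef))
                Out.push_back(Inst{"NOP", -1, {}, false});
        }
        Out.push_back(I);
    }
    return Out;
}
```

Applied to the two-instruction example `R1 = R2 + R3` followed by `LS[R10] = R1`, `insertNops` returns three instructions with the NOP in the middle. Because the check looks at the last instruction actually placed in the output, an inserted NOP separates the pair and the same store is not flagged twice.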
Hal Finkel via llvm-dev
2017-Feb-09 23:42 UTC
[llvm-dev] Specify special cases of delay slots in the back end
On 02/09/2017 04:46 PM, Alex Susu via llvm-dev wrote:
> Hello.
> Hal, thank you for the information.
> I managed to get inspired from PPCHazardRecognizers.cpp. So I
> created my very simple [Target]HazardRecognizers.cpp pass that is also
> derived from ScoreboardHazardRecognizer. My class only implements the
> method getHazardType(), which checks if, as stated in my first email,
> for example, I have a store instruction that is storing the value
> updated by the instruction immediately above, which is NOT ok, since
> for my processor this is a data hazard and in this case I have to
> insert a NOP in between by making getHazardType() to:
>     return NoopHazard; // this basically emits noop
>
> However, to my surprise, my very simple post-RA scheduler (using
> my class derived from ScoreboardHazardRecognizer) is cycling FOREVER
> after this return NoopHazard, by calling getHazardType() again and
> again for this SAME store instruction I found in the first place with
> the data hazard problem. So, llc is no longer finishing - I have to
> stop the process because of this strange behavior.
> I was expecting after the first call to getHazardType() with the
> respective store instruction (and return NoopHazard) that the
> scheduler would move forward to the other instructions in the
> DAG/basic-block.

It should emit a nop if all available instructions return NoopHazard.

> Do you have an idea what can I do to fix this problem?

I'm not sure. I recall running into a situation like this years ago, but I don't recall now how I resolved it. Are you correctly handling the Stalls argument to getHazardType?

-Hal

> Thank you very much,
> Alex

--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
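Hal's remark that a nop is emitted once every available instruction returns NoopHazard suggests that the recognizer has to take that nop into account: a recognizer whose answer depends only on the original program order would report the same hazard on every later query. Whether that is what causes the loop reported in this thread is not established here. The toy model below sketches the interaction under stated assumptions. The names `NoHazard`, `Hazard`, `NoopHazard`, `getHazardType`, `EmitInstruction`, and `EmitNoop` are borrowed from LLVM's hazard recognizer interface, but the signatures are simplified (the `Stalls` argument is omitted) and everything else (`Inst`, `ToyHazardRecognizer`, `dependsOn`, `isReady`, `schedule`) is hypothetical. Memory dependencies are ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum HazardType { NoHazard, Hazard, NoopHazard };

// Hypothetical, simplified instruction record (not llvm::SUnit).
struct Inst {
    std::string text;
    int def;                // register written, or -1 if none
    std::vector<int> uses;  // registers read
    bool isStore;
};

static bool contains(const std::vector<int> &V, int R) {
    for (int X : V)
        if (X == R)
            return true;
    return false;
}

// Toy recognizer: its answer depends on what was emitted last, not on the
// original program order.
class ToyHazardRecognizer {
    int LastDef = -1;  // register written in the previous slot, -1 if none
public:
    HazardType getHazardType(const Inst &I) const {
        if (I.isStore && LastDef >= 0 && contains(I.uses, LastDef))
            return NoopHazard;
        return NoHazard;
    }
    void EmitInstruction(const Inst &I) { LastDef = I.def; }
    // A NOP fills the slot, so the earlier definition is no longer adjacent.
    void EmitNoop() { LastDef = -1; }
};

// True if Later must stay after Earlier (register RAW, WAW, or WAR).
static bool dependsOn(const Inst &Later, const Inst &Earlier) {
    if (Earlier.def >= 0 &&
        (contains(Later.uses, Earlier.def) || Later.def == Earlier.def))
        return true;
    return Later.def >= 0 && contains(Earlier.uses, Later.def);
}

// An instruction is ready when no earlier unscheduled instruction blocks it.
static bool isReady(const std::vector<Inst> &B, const std::vector<bool> &Done,
                    std::size_t I) {
    for (std::size_t J = 0; J < I; ++J)
        if (!Done[J] && dependsOn(B[I], B[J]))
            return false;
    return true;
}

// Toy top-down list scheduler: pick the first ready instruction without a
// hazard; if every ready instruction reports a hazard, emit a NOP.
std::vector<std::string> schedule(const std::vector<Inst> &B) {
    ToyHazardRecognizer HR;
    std::vector<bool> Done(B.size(), false);
    std::vector<std::string> Seq;
    std::size_t Remaining = B.size();
    while (Remaining > 0) {
        bool Picked = false;
        for (std::size_t I = 0; I < B.size() && !Picked; ++I) {
            if (Done[I] || !isReady(B, Done, I))
                continue;
            if (HR.getHazardType(B[I]) != NoHazard)
                continue;
            HR.EmitInstruction(B[I]);
            Seq.push_back(B[I].text);
            Done[I] = true;
            --Remaining;
            Picked = true;
        }
        if (!Picked) {
            HR.EmitNoop();  // without this state update the loop never ends
            Seq.push_back("NOP");
        }
    }
    return Seq;
}
```

In this model, scheduling `R1 = R2 + R3` followed by `LS[R10] = R1` yields a NOP between the two. If an independent instruction such as `R4 = R5 + R6` is also in the block, it is picked for that slot instead and no NOP is emitted. If `EmitNoop` left `LastDef` unchanged, the store would be rejected on every iteration and `schedule` would not terminate.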
Alex Susu via llvm-dev
2017-Feb-10 20:42 UTC
[llvm-dev] Specify special cases of delay slots in the back end
Hello.

I am progressing a bit with difficulty with the post RA scheduler (PostRASchedulerList.cpp with ScoreboardHazardRecognizer) - the problem I have is that it doesn't advance at the next available instruction when the overridden ScoreboardHazardRecognizer::getHazardType() method returns NoopHazard and it gets stuck at the same instruction (store in my runs).

Just to make sure: I am trying to use the post-RA (Register Allocation) scheduler to avoid data hazards by inserting, if possible, other USEFUL instructions from the program instead of (just) NOPs. Is this out-of-order scheduling (e.g., using the ScoreboardHazardRecognizer) that employs useful program instructions instead of NOPs working well with the post-RA scheduler?

Otherwise, if the post RA scheduler only inserts NOPs, since I have issues using it, I could as well insert NOPs in the [Target]AsmPrinter.cpp module.

Thank you,
Alex