On 1/7/2013 2:15 PM, Xu Liu wrote:
>
> This would be ideal. How can I do the instrumentation pass after the
> instruction scheduling?

You could derive your own class from TargetPassConfig, and add the annotation pass in YourDerivedTargetPassConfig::addPreEmitPass. This will add your annotation pass very late, just before the final code is emitted. If you're using the X86 target, then the class and the function are already there:

lib/Target/X86/X86TargetMachine.cpp:

    bool X86PassConfig::addPreEmitPass() {
      bool ShouldPrint = false;
      if (getOptLevel() != CodeGenOpt::None && getX86Subtarget().hasSSE2()) {
        addPass(createExecutionDependencyFixPass(&X86::VR128RegClass));
        ShouldPrint = true;
      }

      if (getX86Subtarget().hasAVX() && UseVZeroUpper) {
        addPass(createX86IssueVZeroUpperPass());
        ShouldPrint = true;
      }

      return ShouldPrint;
    }

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
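To make the placement concrete, here is a minimal sketch of the hook idea in plain C++. It is a toy model, not the LLVM API: ToyPassConfig, MyPassConfig, the pass names, and the simplified stage list are all made up for illustration, and each "pass" is just a string. It only shows that a pass added from an overridden addPreEmitPass lands after the late scheduling stage and right before emission.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a target pass configuration. NOT the LLVM class:
// the stage list is a simplified assumption and passes are plain names.
class ToyPassConfig {
public:
  virtual ~ToyPassConfig() = default;

  // Builds the pipeline in a fixed order and calls the hook just
  // before the emission stage.
  std::vector<std::string> build() {
    Passes.clear();
    addPass("instruction-selection");
    addPass("register-allocation");
    addPass("post-ra-scheduling");
    addPreEmitPass();  // last extension point before emission
    addPass("code-emission");
    return Passes;
  }

protected:
  void addPass(const std::string &Name) { Passes.push_back(Name); }

  // Default hook adds nothing. The bool return mirrors the shape of the
  // quoted X86 function; the toy model does not use the value.
  virtual bool addPreEmitPass() { return false; }

private:
  std::vector<std::string> Passes;
};

// Hypothetical derived configuration that schedules an annotation pass
// as late as possible.
class MyPassConfig : public ToyPassConfig {
protected:
  bool addPreEmitPass() override {
    addPass("my-annotation");  // hypothetical pass name
    return true;
  }
};
```

Calling build() on a MyPassConfig object returns the stage names in order, with "my-annotation" sitting between "post-ra-scheduling" and "code-emission". In a real target, the equivalent change would go into the target's own pass config class, as in the X86 code quoted above.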
Liu,

This is likely a better solution for you - you do not want to mess with the scheduler unless you really have to ;)

Sergei

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Krzysztof Parzyszek
> Sent: Monday, January 07, 2013 2:26 PM
> To: Xu Liu
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] instruction scheduling issue
>
> On 1/7/2013 2:15 PM, Xu Liu wrote:
> >
> > This would be ideal. How can I do the instrumentation pass after the
> > instruction scheduling?
>
> You could derive your own class from TargetPassConfig, and add the
> annotation pass in YourDerivedTargetPassConfig::addPreEmitPass. This
> will add your annotation pass very late, just before the final code is
> emitted.
On 1/7/2013 2:25 PM, Krzysztof Parzyszek wrote:
> On 1/7/2013 2:15 PM, Xu Liu wrote:
>>
>> This would be ideal. How can I do the instrumentation pass after the
>> instruction scheduling?
>
> You could derive your own class from TargetPassConfig, and add the
> annotation pass in YourDerivedTargetPassConfig::addPreEmitPass.

If you need your pass to run before register allocation, you can use the addPreRegAlloc function in the same way. The only problem is that another scheduling pass runs after register allocation, but you should be able to disable it.

-Krzysztof
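A rough sketch of that trade-off, again as a toy model in plain C++ rather than the LLVM API. ToyPipelineOptions, EnablePostRAScheduler, the pass names, and the stage order are assumptions made for illustration. It shows a pass added from a pre-register-allocation hook, and checks whether any scheduling stage still runs after it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical switch standing in for "disable the scheduler that runs
// after register allocation".
struct ToyPipelineOptions {
  bool EnablePostRAScheduler = true;
};

// Toy pass configuration, NOT the LLVM class. Passes are plain names and
// the stage order is a simplified assumption.
class ToyPassConfig {
public:
  explicit ToyPassConfig(ToyPipelineOptions O) : Opts(O) {}
  virtual ~ToyPassConfig() = default;

  std::vector<std::string> build() {
    Passes.clear();
    addPass("instruction-selection");
    addPreRegAlloc();  // extension point before register allocation
    addPass("register-allocation");
    if (Opts.EnablePostRAScheduler)
      addPass("post-ra-scheduling");
    addPass("code-emission");
    return Passes;
  }

protected:
  void addPass(const std::string &Name) { Passes.push_back(Name); }
  virtual bool addPreRegAlloc() { return false; }

private:
  ToyPipelineOptions Opts;
  std::vector<std::string> Passes;
};

// Hypothetical configuration that instruments before register allocation.
class MyPassConfig : public ToyPassConfig {
public:
  using ToyPassConfig::ToyPassConfig;

protected:
  bool addPreRegAlloc() override {
    addPass("my-instrumentation");  // hypothetical pass name
    return true;
  }
};

// Returns true if some stage whose name contains "scheduling" runs after
// the stage called Name, i.e. the instrumented code may still be reordered.
bool schedulerRunsAfter(const std::vector<std::string> &Pipeline,
                        const std::string &Name) {
  auto It = std::find(Pipeline.begin(), Pipeline.end(), Name);
  if (It == Pipeline.end())
    return false;
  return std::any_of(It + 1, Pipeline.end(), [](const std::string &S) {
    return S.find("scheduling") != std::string::npos;
  });
}
```

With the default options, schedulerRunsAfter(pipeline, "my-instrumentation") is true, which is the problem described above. With EnablePostRAScheduler set to false, it is false, so nothing in this toy pipeline reorders the instrumented code afterwards.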
Krzysztof,

Thanks for your helpful answers.

Xu

Quoting Krzysztof Parzyszek <kparzysz at codeaurora.org>:
> On 1/7/2013 2:15 PM, Xu Liu wrote:
>>
>> This would be ideal. How can I do the instrumentation pass after the
>> instruction scheduling?
>
> You could derive your own class from TargetPassConfig, and add the
> annotation pass in YourDerivedTargetPassConfig::addPreEmitPass.
> This will add your annotation pass very late, just before the final
> code is emitted.
Hello everybody,

I have a case of suspected indeterminism and I would like to verify that it is not a known issue before I dig deep into it.

It seems to happen during the PreVerifier pass ("Preliminary module verification"). As far as I understand, a verifier pass is not supposed to change the code (or is it?), but in the debug stream I see the following.

Common predecessor:

*** IR Dump After Loop-Closed SSA Form Pass ***
for.body.us68:    ; preds = %for.body.lr.ph.us81, %for.body.us68
  %arrayidx.us70.phi = phi i8* [ %buf.0.ph, %for.body.lr.ph.us81 ], [ %arrayidx.us70.inc, %for.body.us68 ]
  %add.ptr4.us72.phi = phi i8* [ %add.ptr4.us72.gep, %for.body.lr.ph.us81 ], [ %add.ptr4.us72.inc, %for.body.us68 ]
  %i.043.us69 = phi i32 [ 0, %for.body.lr.ph.us81 ], [ %inc.us73, %for.body.us68 ]
...
LV: Found a vectorizable loop (8) in core_state.i
LV: Adding RT check for range: %add.ptr4.us72.phi = phi i8* [ %add.ptr4.us72.gep, %for.body.lr.ph.us81 ], [ %add.ptr4.us72.inc, %for.body.us68 ]
LV: Adding RT check for range: %arrayidx.us70.phi = phi i8* [ %buf.0.ph, %for.body.lr.ph.us81 ], [ %arrayidx.us70.inc, %for.body.us68 ]

Then there are two possible outcomes, triggered by a code change in a completely unrelated portion of the code and a rebuild:

*** IR Dump After Preliminary module verification ***

First version:

for.body.us68:    ; preds = %scalar.ph, %for.body.us68
  %arrayidx.us70.phi = phi i8* [ %resume.val200, %scalar.ph ], [ %arrayidx.us70.inc, %for.body.us68 ]
  %add.ptr4.us72.phi = phi i8* [ %resume.val, %scalar.ph ], [ %add.ptr4.us72.inc, %for.body.us68 ]

Second version:

for.body.us68:    ; preds = %scalar.ph, %for.body.us68
  %arrayidx.us70.phi = phi i8* [ %resume.val, %scalar.ph ], [ %arrayidx.us70.inc, %for.body.us68 ]
  %add.ptr4.us72.phi = phi i8* [ %resume.val200, %scalar.ph ], [ %add.ptr4.us72.inc, %for.body.us68 ]

The only difference is that %resume.val and %resume.val200 are swapped between the two phi nodes. This difference snowballs thereafter, causing a different instruction order and ultimately different code.
If it rings a bell for anyone, or it is a known issue, please let me know.

Thanks.

Sergei
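For readers chasing a similar symptom: this is not a diagnosis of the case above, only an illustration of one common way such ordering differences arise in compilers. If values are visited in the order of a container sorted or hashed by pointer value, the visiting order follows memory layout, which an unrelated code change can shift. The sketch below is plain C++ with made-up names (Value, namesInAddressOrder, namesInInsertionOrder); it contrasts address order with insertion order.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Made-up stand-in for an IR value; only the name matters here.
struct Value {
  std::string Name;
};

// Visits the values through a pointer-ordered set. The resulting order
// follows the objects' addresses, not the order they were added in.
std::vector<std::string>
namesInAddressOrder(const std::vector<const Value *> &Inserted) {
  std::set<const Value *> ByAddress(Inserted.begin(), Inserted.end());
  std::vector<std::string> Out;
  for (const Value *V : ByAddress)
    Out.push_back(V->Name);
  return Out;
}

// Visits the values in insertion order, using the set only to skip
// duplicates. The result does not depend on where the objects live.
std::vector<std::string>
namesInInsertionOrder(const std::vector<const Value *> &Inserted) {
  std::set<const Value *> Seen;
  std::vector<std::string> Out;
  for (const Value *V : Inserted)
    if (Seen.insert(V).second)
      Out.push_back(V->Name);
  return Out;
}
```

If two values are inserted as "arrayidx" then "add.ptr4", but "add.ptr4" happens to sit at the lower address, the address-ordered walk yields "add.ptr4" first while the insertion-ordered walk keeps "arrayidx" first. Comparing the order in which the two runs add the runtime checks would be one way to test whether something like this is involved.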