Hi Sergei,

"addRuntimeCheck" inserts code that checks that two or more arrays are disjoint. I looked at the code and it looks fine. We generate PHIs in the order that they appear in a vector. The values are inserted in 'canVectorizeMemory', which also looks fine. Please let me know if you think I missed something.

Thanks,
Nadav

On Jan 29, 2013, at 8:48 AM, Sergei Larin <slarin at codeaurora.org> wrote:

> Nadav,
>
> As I peel this onion, it looks like you might know something about
> InnerLoopVectorizer::addRuntimeCheck.
> What does it do, and can it be causing the below described issue? Could
> resuming somehow (indeterministically) switch the order of PHIs in the
> original code?
>
> Thanks a lot.
>
> Sergei.
>
> ---
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
> The Linux Foundation
>
>> -----Original Message-----
>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
>> On Behalf Of Sergei Larin
>> Sent: Tuesday, January 29, 2013 10:31 AM
>> To: llvmdev at cs.uiuc.edu
>> Subject: [LLVMdev] Apparent indeterminism in PreVerifier
>>
>> Hello everybody,
>>
>> I have a case of suspected indeterminism and I would like to verify
>> that it is not a known issue before I dig deep into it.
>> It seems to happen during PreVerifier pass ("Preliminary module
>> verification"). The little I understand/assume about it, a verifier
>> pass is not supposed to change the code (or is it?) but in debug stream
>> I see the following:
>>
>> Common predecessor:
>>
>> *** IR Dump After Loop-Closed SSA Form Pass ***
>> for.body.us68:        ; preds = %for.body.lr.ph.us81, %for.body.us68
>>   %arrayidx.us70.phi = phi i8* [ %buf.0.ph, %for.body.lr.ph.us81 ], [ %arrayidx.us70.inc, %for.body.us68 ]
>>   %add.ptr4.us72.phi = phi i8* [ %add.ptr4.us72.gep, %for.body.lr.ph.us81 ], [ %add.ptr4.us72.inc, %for.body.us68 ]
>>   %i.043.us69 = phi i32 [ 0, %for.body.lr.ph.us81 ], [ %inc.us73, %for.body.us68 ]
>>   ...
>>
>> LV: Found a vectorizable loop (8) in core_state.i
>> LV: Adding RT check for range: %add.ptr4.us72.phi = phi i8* [ %add.ptr4.us72.gep, %for.body.lr.ph.us81 ], [ %add.ptr4.us72.inc, %for.body.us68 ]
>> LV: Adding RT check for range: %arrayidx.us70.phi = phi i8* [ %buf.0.ph, %for.body.lr.ph.us81 ], [ %arrayidx.us70.inc, %for.body.us68 ]
>>
>> Then there are two possible outcomes triggered by a code change in
>> completely unrelated portion of the code and rebuild:
>>
>> *** IR Dump After Preliminary module verification ***
>>
>> First version:
>>
>> for.body.us68:        ; preds = %scalar.ph, %for.body.us68
>>   %arrayidx.us70.phi = phi i8* [ %resume.val200, %scalar.ph ], [ %arrayidx.us70.inc, %for.body.us68 ]
>>   %add.ptr4.us72.phi = phi i8* [ %resume.val, %scalar.ph ], [ %add.ptr4.us72.inc, %for.body.us68 ]
>>
>> Second version:
>>
>> for.body.us68:        ; preds = %scalar.ph, %for.body.us68
>>   %arrayidx.us70.phi = phi i8* [ %resume.val, %scalar.ph ], [ %arrayidx.us70.inc, %for.body.us68 ]
>>   %add.ptr4.us72.phi = phi i8* [ %resume.val200, %scalar.ph ], [ %add.ptr4.us72.inc, %for.body.us68 ]
>>
>> This difference snowballs there after causing different instruction
>> order and ultimately a different code.
>>
>> If it rings the bell for anyone, or it is a known issue, please let me
>> know.
>>
>> Thanks.
>>
>> Sergei
>>
>> ---
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> hosted by The Linux Foundation
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
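For readers unfamiliar with what a "runtime check that two arrays are disjoint" amounts to, here is a minimal source-level sketch in C++. It is not the code that addRuntimeCheck emits; the names rangesDisjoint and addBytes are hypothetical and chosen only for illustration. The idea is that the fast (vectorizable) path is taken only when the two byte ranges cannot overlap, and a scalar fallback runs otherwise.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical helper (not LLVM's implementation): returns true when the
// byte ranges [a, a + n) and [b, b + n) do not overlap.
// std::less_equal is used because it yields a total order over pointers,
// even when they point into unrelated arrays.
bool rangesDisjoint(const std::uint8_t *a, const std::uint8_t *b,
                    std::size_t n) {
  std::less_equal<const std::uint8_t *> le;
  return le(a + n, b) || le(b + n, a);
}

// Hypothetical kernel: dst[i] += src[i], guarded by the overlap check.
void addBytes(std::uint8_t *dst, const std::uint8_t *src, std::size_t n) {
  if (rangesDisjoint(dst, src, n)) {
    // Ranges are disjoint: a compiler is free to vectorize this loop.
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = static_cast<std::uint8_t>(dst[i] + src[i]);
  } else {
    // Ranges may overlap: keep strict element-by-element order.
    for (std::size_t i = 0; i < n; ++i) {
      const std::uint8_t s = src[i];
      dst[i] = static_cast<std::uint8_t>(dst[i] + s);
    }
  }
}
```

The two "LV: Adding RT check for range" lines in the debug output above correspond to the two pointers that such a guard would compare.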
Nadav,

Thanks for the quick response. By now I am convinced that the given loop ends up vectorized with enough difference to cause bad things later on, but I have not found the exact cause yet. To continue with my work I'll have to simply turn off vectorization for now, but I will come back and investigate. Again, there is some indeterminism in order of PHIs processing somewhere. I'll keep you posted.

Sergei

---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation

> -----Original Message-----
> From: Nadav Rotem [mailto:nrotem at apple.com]
> Sent: Tuesday, January 29, 2013 11:15 AM
> To: Sergei Larin
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Apparent indeterminism in PreVerifier
>
> Hi Sergei,
>
> "addRuntimeCheck" inserts code that checks that two or more arrays are
> disjoint. I looked at the code and it looks fine. We generate PHIs in
> the order that they appear in a vector. The values are inserted in
> 'canVectorizeMemory', which also looks fine. Please let me know if you
> think I missed something.
>
> Thanks,
> Nadav
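As general background on the "indeterminism in order of PHIs processing" mentioned above: a common way for compiler output to vary between builds is iterating a container that is ordered or hashed by pointer value, because allocation addresses can shift when unrelated code changes. The thread does not establish that this is the cause here. The sketch below, in C++ with hypothetical names (Value, InsertionOrderedSet, namesInInsertionOrder), only illustrates the usual remedy: deduplicate with a set, but iterate in insertion order.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an IR value; not an LLVM class.
struct Value {
  std::string name;
};

// Hypothetical container: rejects duplicates via a hash set, but exposes
// the elements in the order they were first inserted, so iteration order
// does not depend on pointer addresses.
class InsertionOrderedSet {
  std::vector<const Value *> order_;
  std::unordered_set<const Value *> seen_;

public:
  bool insert(const Value *v) {
    if (!seen_.insert(v).second)
      return false; // already present, keep the original position
    order_.push_back(v);
    return true;
  }
  const std::vector<const Value *> &items() const { return order_; }
};

// Collects the names of the visited values, once each, in first-visit order.
std::vector<std::string>
namesInInsertionOrder(const std::vector<const Value *> &visited) {
  InsertionOrderedSet set;
  for (const Value *v : visited)
    set.insert(v);
  std::vector<std::string> out;
  for (const Value *v : set.items())
    out.push_back(v->name);
  return out;
}
```

Iterating seen_ directly instead of order_ would give an order that depends on where each Value happened to be allocated, which is the kind of difference that could swap two otherwise equivalent PHIs.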
Is there a test case that you can share?

On Jan 29, 2013, at 9:24 AM, Sergei Larin <slarin at codeaurora.org> wrote:

> Nadav,
>
> Thanks for the quick response. By now I am convinced that the given loop
> ends up vectorized with enough difference to cause bad things later on, but
> I have not found the exact cause yet. To continue with my work I'll have to
> simply turn off vectorization for now, but I will come back and investigate.
> Again, there is some indeterminism in order of PHIs processing somewhere.
> I'll keep you posted.
>
> Sergei