Chawla, Pankaj via llvm-dev
2017-Apr-14 23:55 UTC
[llvm-dev] Question on induction variable simplification pass
Hi Sanjoy, I have attached the IR I got by compiling with -O2. This is just before we widen the IV. To get the backedge taken count info I ran indvars on it and then replaced zext with sext. I think regardless of where we decide to add this transformation in the pipeline, it should try to preserve as much information as it can. This means that we should generate sext for signed IVs and vice-versa. I believe this is a better approach as it preserves the information directly in the IR as opposed to relying on ScalarEvolution to deduce it. Moving it to a different location can be done separately. Do you agree? Thanks, Pankaj -----Original Message----- From: Sanjoy Das [mailto:sanjoy at playingwithpointers.com] Sent: Thursday, April 13, 2017 8:35 PM To: llvm-dev at lists.llvm.org; Chawla, Pankaj Subject: Re: [llvm-dev] Question on induction variable simplification pass Hi Pankaj, On April 12, 2017 at 5:22:45 PM, Chawla, Pankaj via llvm-dev (llvm-dev at lists.llvm.org) wrote:> It looks like the induction variable simplification pass prefers doing > a zero-extension to compute the wider trip count of loops when > extending the IV. This can sometimes result in loss of information > making ScalarEvolution's analysis conservative which can lead to missed performance opportunities. > > For example, consider this loopnest- > > int i, j; > for(i=0; i< 40; i++) > for(j=0; j< i-1; j++) > A[i+j] = j; > > > We are mainly interested in the backedge taken count of the inner loop. > > Before indvars, the backedge information computed by ScalarEvolution > is as follows- > > Outer loop- > backedge-taken count is 39 > max backedge-taken count is 39 > > Inner loop- > backedge-taken count is {-2,+,1}<%for.cond1.preheader> max > backedge-taken count is 37 > > > After indvars, the backedge information computed by ScalarEvolution is > as follows- > > Outer loop- > backedge-taken count is 39 > max backedge-taken count is 39 > > Inner loop- > backedge-taken count is (-1 + (zext i32 {-1,+,1}<%for.cond1.preheader> > to i64)) max backedge-taken count is -1One approach is to use the facts: - The inner loop will not be entered in the 0th iteration of <%for.cond1.preheader> - {-1,+,1}<%for.cond1.preheader> is s< 40 to simplify the above to {-2,+,1}<%for.cond1.preheader> (in i64). The original expression was not -2 in the 0th iteration of <%for.cond1.preheader>, but we don't care about that iteration of <%for.cond1.preheader> since we won't enter the inner loop. The other option is to widen of IVs late, as a "lowering" transformation, like Hal said. That's a more invasive change, but if you have time and resources, it would be nice to at least give it a shot, measure and see what falls over.> If we generate sext instead of zext, the information computed by > ScalarEvolution is as follows- > > Outer loop- > backedge-taken count is 39 > max backedge-taken count is 39 > > Inner loop- > backedge-taken count is {-2,+,1}<%for.cond1.preheader> max > backedge-taken count is -2 > > We now have a simplified backedge taken count for the inner loop which > is almost as precise as before indvars, except for the flag instead of > flag. I think ScalarEvolution(JFYI: My mail client's compose ate the <nsw> and <nw>) Can you please share the IR that you piped through SCEV? My guess is that SCEV did not "try to" infer a more aggressive no-wrap flag for {-2,+,1} -- most of the no-wrap inferring logic kicks in when you try to sign/zero extend an add recurrence. One suspicious bit here is the "max backedge-taken count is -2" bit. I expected SCEV to have inferred the max trip count to be 37 in this second case. -- Sanjoy -------------- next part -------------- A non-text attachment was scrubbed... Name: indvars.ll Type: application/octet-stream Size: 1435 bytes Desc: indvars.ll URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170414/9273ec1d/attachment.obj>
Sanjoy Das via llvm-dev
2017-Apr-17 04:29 UTC
[llvm-dev] Question on induction variable simplification pass
Hi Pankaj, On April 14, 2017 at 4:55:16 PM, Chawla, Pankaj (pankaj.chawla at intel.com) wrote:> I have attached the IR I got by compiling with -O2. This is just before we widen the IV.Thanks!> To get the backedge taken count info I ran indvars on it and then replaced zext with sext. > > I think regardless of where we decide to add this transformation in the pipeline, it should > try to preserve as much information as it can. This means that we should generate sext > for signed IVs and vice-versa. I believe this is a better approach as it preserves the > information directly in the IR as opposed to relying on ScalarEvolution to deduce it.I'll be happy to review patches making indvars behave better here (i.e. not "break" loop trip counts like this). I don't think the IV is the most relevant bit here though -- it looks like (only a guess) indvars is faltering here: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/IndVarSimplify.cpp#L2240 and that logic needs to be made smarter to account for how much the RHS of the LFTR'ed exit condition is simplified after extension.> Moving it to a different location can be done separately. > > Do you agree?Sounds good! Thanks! -- Sanjoy
Sanjoy Das via llvm-dev
2017-Apr-17 04:29 UTC
[llvm-dev] Question on induction variable simplification pass
Hi Pankaj, On April 14, 2017 at 4:55:16 PM, Chawla, Pankaj (pankaj.chawla at intel.com) wrote:> I have attached the IR I got by compiling with -O2. This is just before we widen the IV.Thanks!> To get the backedge taken count info I ran indvars on it and then replaced zext with sext. > > I think regardless of where we decide to add this transformation in the pipeline, it should > try to preserve as much information as it can. This means that we should generate sext > for signed IVs and vice-versa. I believe this is a better approach as it preserves the > information directly in the IR as opposed to relying on ScalarEvolution to deduce it.I'll be happy to review patches making indvars behave better here (i.e. not "break" loop trip counts like this). I don't think the IV is the most relevant bit here though -- it looks like (only a guess) indvars is faltering here: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/IndVarSimplify.cpp#L2240 and that logic needs to be made smarter to account for how much the RHS of the LFTR'ed exit condition is simplified after extension.> Moving it to a different location can be done separately. > > Do you agree?Sounds good! Thanks! -- Sanjoy
Sanjoy Das via llvm-dev
2017-Apr-17 04:31 UTC
[llvm-dev] Question on induction variable simplification pass
Hi Pankaj, On April 14, 2017 at 4:55:16 PM, Chawla, Pankaj (pankaj.chawla at intel.com) wrote:> I have attached the IR I got by compiling with -O2. This is just before we widen the IV.Thanks!> To get the backedge taken count info I ran indvars on it and then replaced zext with sext. > > I think regardless of where we decide to add this transformation in the pipeline, it should > try to preserve as much information as it can. This means that we should generate sext > for signed IVs and vice-versa. I believe this is a better approach as it preserves the > information directly in the IR as opposed to relying on ScalarEvolution to deduce it.I'll be happy to review patches making indvars behave better here (i.e. not "break" loop trip counts like this). I don't think the IV is the most relevant bit here though -- it looks like (only a guess) indvars is faltering here: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Scalar/IndVarSimplify.cpp#L2240 and that logic needs to be made smarter to account for how much the RHS of the LFTR'ed exit condition is simplified after extension.> Moving it to a different location can be done separately. > > Do you agree?Sounds good! -- Sanjoy