Displaying 20 results from an estimated 371 matches for "lsr".

2015 Sep 26
2
[RFC] New pass: LoopExitValues
Hi Steve, Do you primarily find this to help for nested loops? If so, that could be because LSR explicitly bails out of processing them: // Skip nested loops until we can model them better with formulae. if (!L->empty()) { DEBUG(dbgs() << "LSR skipping outer loop " << *L << "\n"); return; } I don't know how much time you're...
2015 Sep 03
2
[RFC] New pass: LoopExitValues
.... You've mentioned matrix multiply - how does > this pass alter the IR? Here's before and after IR for the matrix_mul example. Notice the two bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V pass converts these to scevgep values that already exist. *** Code after LSR *** ; Function Attrs: nounwind optsize define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src, i32 %Val) #0 { entry: %cmp.25 = icmp eq i32 %Size, 0 br i1 %cmp.25, label %for.cond.cleanup, label %for.body.4.lr.ph.preheader for.body.4.lr.ph.preheader:...
2020 Jun 10
2
LoopStrengthReduction generates false code
The IR after LSR is: *** IR Dump After Loop Strength Reduction *** ; Preheader: entry: tail call void @fill_array(i32* getelementptr inbounds ([10 x i32], [10 x i32]* @buffer, i32 0, i32 0)) #2 br label %while.body ; Loop: while.body: ; preds = %while.body, %entry %lsr....
2020 Jun 09
2
LoopStrengthReduction generates false code
.../final)"} >> !2 = !{!3, !3, i64 0} >> !3 = !{!"int", !4, i64 0} >> !4 = !{!"omnipotent char", !5, i64 0} >> !5 = !{!"Simple C/C++ TBAA"} >> >> >> (-debug-only=scalar-evolution,loop-reduce) for my arch: >> >> LSR on loop %while.body: >> Collecting IV Chains. >> IV Chain#0 Head: ( %0 = load i32, i32* %arrayidx, align 4, !tbaa !2) >> IV={@buffer,+,8}<nsw><%while.body> >> IV Chain#1 Head: ( %cmp11 = icmp eq i32 %i.010, 0) >> IV={0,+,1}<nuw><nsw><%while...
2020 Jun 09
2
LoopStrengthReduction generates false code
..."wchar_size", i32 4} !1 = !{!"clang version 7.0.1 (tags/RELEASE_701/final)"} !2 = !{!3, !3, i64 0} !3 = !{!"int", !4, i64 0} !4 = !{!"omnipotent char", !5, i64 0} !5 = !{!"Simple C/C++ TBAA"} (-debug-only=scalar-evolution,loop-reduce) for my arch: LSR on loop %while.body: Collecting IV Chains. IV Chain#0 Head: ( %0 = load i32, i32* %arrayidx, align 4, !tbaa !2) IV={@buffer,+,8}<nsw><%while.body> IV Chain#1 Head: ( %cmp11 = icmp eq i32 %i.010, 0) IV={0,+,1}<nuw><nsw><%while.body> IV Chain#1 Inc: ( %i.010 = phi i3...
2015 Sep 10
2
[RFC] New pass: LoopExitValues
...s pass alter the IR? > > > > Here's before and after IR for the matrix_mul example. Notice the two > > bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V > > pass converts these to scevgep values that already exist. > > > > *** Code after LSR *** > > > > ; Function Attrs: nounwind optsize > > define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture > > readonly %Src, i32 %Val) #0 { > > entry: > > %cmp.25 = icmp eq i32 %Size, 0 > > br i1 %cmp.25, label %for.cond.cleanup, label ...
2015 Sep 23
3
[RFC] New pass: LoopExitValues
On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >> Should we try the patch in its current location, namely after LSR? > > Sure; post the patch as you have it so we can look at what's going on. > http://reviews.llvm.org/D12494 One particular point: The algorithm checks that SCEVs are equal when their raw pointers are equal. Is that a future-proof feature of SCEVs?
2015 Sep 01
2
[RFC] New pass: LoopExitValues
On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem <jvanadrighem at gmail.com> wrote: > Do you have some specific performance measurements? Averaging 4 runs of 10000 iterations each of Coremark on my X86_64 desktop showed: -O2 performance: +2.9% faster with the L.E.V. pass -Os size: 1.5% smaller with the L.E.V. pass In the case of Coremark, the benefit comes mainly from the matrix
2017 Apr 10
2
LSR
Hi, I find that LSR is not doing enough to avoid unfoldable offsets for SystemZ. When the loop has three stores with unfoldable offsets, LSR rewrites the IV in a good way. However, if I add another store with a foldable offset that already fits, LSR fails to rewrite the three stores. And if I happen to add a...
2017 Apr 11
2
LSR
>> Has anyone any idea on how to best handle this? Can LSR "split" an IV >> to use an extra register? Or would this need to be done in a target >> specific pass? > > When you say "an extra address register" would this imply LSR adding > an additional PHI? > > -Hal > Yes, that would have worked well at l...
2017 Jul 31
1
LLVM's loop strength reduction module
Hi, Sorry I took a long time to reply as it took me some time to get some understanding of the code even to ask some specific questions (I have a test case in which LSR does not kick in and wanted to understand the code to figure out why it was not kicking in). Here are some specific questions I have: 1) It appears that LSR works only for the inner-most loop. Is this correct? Can you tell me why this is so? I believe SCEV works for nested loops, right? 2)...
2017 Jul 06
3
LLVM's loop strength reduction module
Hi Raghavan, I concur no specific docs. What do you want to know specifically? Cheers, -Quentin > On Jul 5, 2017, at 11:16 PM, Madhur Amilkanthwar via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > AFAIK, no official doc. > You can probably get better help if you ask specific questions (which part of the code you don't understand). > > On Thu, Jul 6, 2017 at 9:53
2011 Oct 12
1
[PATCH] ns16550: fix poll handling regression
Prior to c/s 23811:f1349a968a5a LSR_THRE was checked only once, while there it got promoted into the surrounding loop's condition. Since that bit may not clear for an extended period of time (i.e. when no new output is generated), it must not be used in this way indefinitely. Signed-off-by: Jan Beulich <jbeulich@suse.com...
2013 Mar 14
3
[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
...& BaseGV) const; In NarrowSearchSpaceByPickingWinnerRegs, we can preserve the winner reg from the target and the winner reg from the original algorithm; if this function returns NULL, it behaves just like before. For case two, we can define a general cost from a TTI function, like virtual int getLSRFormulaCost(const unsigned NumRegs, const unsigned AddRecCost, const unsigned NumIVMuls, const unsigned NumBaseAdds, const unsigned ImmCost, const unsigned SetupCost) const; Then we do something like int this...
2013 Mar 14
0
[LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please
...l Message ----- > From: "Yin Ma" <yinma at codeaurora.org> > To: "Andrew Trick" <atrick at apple.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Thursday, March 14, 2013 4:21:50 PM > Subject: Re: [LLVMdev] Suggestion About Adding Target Dependent Decision in LSR Please > > > > > > Hi Andy, > > > > Actually, if we just add hooks that preserve the existing behavior, > > it is not difficult. For example, > > > > For case one, we can define one function like > > virtual const SCEV* getTargetPr...
2012 Nov 26
2
[LLVMdev] LSR pass
Hi, I would like some help regarding the LSR pass. It seems that it likes to duplicate address calculations as in the case above, which is highly undesirable on my target. I wonder if there is any way to tell LSR to not duplicate the code in cases like this? Or could I perhaps run CSE after LSR again? What is the logic behind this transforma...
2015 Aug 18
2
RFC for a design change in LoopStrengthReduce / ScalarEvolution
...> `GEP @Global, zext(V)` -> `GEP (@Global + zext VStart), {i64 0,+,1}` > `V` -> `trunc({i64 0,+,1}) + VStart` > > instead of the actually-better solution: > > `GEP @Global, zext(V)` -> `GEP @Global, zext({VStart,+,1})` > `V` -> `{VStart,+,1}` > > where LSR never considers the latter case because it transforms: > > `zext({VStart,+,1})` to `{zext VStart,+,1}` > > and, thus, never considers the formula with zext on the outside? Your proposed solution is that LSR should be able to create: > > zext(opaque({VStart,+,1})) > > for...
2010 Aug 11
2
[LLVMdev] LSR is Unbearably Slow
I just filed bug 7872 about non-scalability of the LSR analysis algorithms. It may be related to bug 6727. The fundamental problem appears to be re-running SCEV analyses such as properlyDominates and SCEVComplexityCompare over and over again on large SCEV expressions. Memoizing results for SCEVComplexityCompare appears to help significantly but that...
2012 Dec 04
0
[LLVMdev] LSR pass
Hi, The target supports indexing by register or immediate. Multiplications are not supported by any load / store instructions. Would it be possible to make LSR aware of this? Thanks, Jonas Paulsson -----Original Message----- From: Hal Finkel [mailto:hfinkel at anl.gov] Sent: Saturday, December 01, 2012 5:59 AM To: Jonas Paulsson Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] LSR pass ----- Original Message ----- > From: "Jonas Paulsson"...
2014 Feb 19
2
[LLVMdev] better code for IV
Hi Andrew, The issue below refers to LSR, so I'll appreciate your feedback. It also refers to instruction combining and might impact backends other than X86, so if you know of others that might be interested you are more than welcome to add them. Thanks, Anat _____________________________________________ From: Shemer, Anat Sent: Tue...