thr3ads.net - llvm dev - [llvm-dev] [RFC] New pass: LoopExitValues [Sep 2015]

If this information is useful, please help other people find it:
Share via:

Steve King via llvm-dev

2015-Sep-23 19:09 UTC

[llvm-dev] [RFC] New pass: LoopExitValues

On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov>
wrote:>>
>> Should we try the patch in it's current location, namely after LSR?
>
> Sure; post the patch as you have it so we can look at what's going on.
>
http://reviews.llvm.org/D12494

One particular point: The algorithm checks that SCEV's are equal when
their raw pointers are equal.  Is that a future-proof feature of
SCEVs?

Hal Finkel via llvm-dev

2015-Sep-23 19:33 UTC

head link

[llvm-dev] [RFC] New pass: LoopExitValues

----- Original Message -----> From: "Steve King" <steve at metrokings.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Wei Mi" <wmi at google.com>, "Philip Reames"
<listmail at philipreames.com>, "llvm-dev" <llvm-dev at
lists.llvm.org>
> Sent: Wednesday, September 23, 2015 2:09:52 PM
> Subject: Re: [llvm-dev] [RFC] New pass: LoopExitValues
> 
> On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov>
wrote:
> >>
> >> Should we try the patch in it's current location, namely after
> >> LSR?
> >
> > Sure; post the patch as you have it so we can look at what's going
> > on.
> >
> 
> http://reviews.llvm.org/D12494
> 
> One particular point: The algorithm checks that SCEV's are equal when
> their raw pointers are equal.  Is that a future-proof feature of
> SCEVs?
Yes. If we ever change that, we'll have a lot of code to update.

 -Hal
> 
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Mehdi Amini via llvm-dev

2015-Sep-24 02:35 UTC

head link

[llvm-dev] [RFC] New pass: LoopExitValues

Hi,
> On Sep 23, 2015, at 12:09 PM, Steve King via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov>
wrote:
>>> 
>>> Should we try the patch in it's current location, namely after
LSR?
>> 
>> Sure; post the patch as you have it so we can look at what's going
on.
>> 
> 
> http://reviews.llvm.org/D12494
> 
> One particular point: The algorithm checks that SCEV's are equal when
> their raw pointers are equal.  Is that a future-proof feature of
> SCEVs?

I don’t know much about this optimization or its implementation, but I just ran
it on our test-suite (both the public one and some internal benchmarks) on OS X
(X86) and I didn’t observe much changes:

- 3.37% regression on SingleSource/Benchmarks/Shootout/matrix
- 3.87% and 2.46% improvements on two SPEC benchmarks.

(Patch applied on top of r248302).

— 
Mehdi

Sanjoy Das via llvm-dev

2015-Sep-26 23:28 UTC

head link

[llvm-dev] [RFC] New pass: LoopExitValues

Hi Steve,

Do you primarily find this to help for nested loops?  If so, that
could be because LSR explicitly bails out of processing them:

   // Skip nested loops until we can model them better with formulae.
   if (!L->empty()) {
     DEBUG(dbgs() << "LSR skipping outer loop " << *L
<< "n");
     return;
   }

I don't know how much time you're willing to commit to this, but
perhaps a more principled fix is to change LSR to actually work with
nested loops?

If I comment out this change, after LSR the matric_mul routine does
not actually look any better (possibly even worse):

define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly
%Src, i32 %Val) {
entry:
   %Src12 = bitcast i32* %Src to i8*
   %Dst14 = bitcast i32* %Dst to i8*
   %cmp.25 = icmp eq i32 %Size, 0
   br i1 %cmp.25, label %for.cond.cleanup, label %for.body.4.lr.ph.preheader

for.body.4.lr.ph.preheader:                       ; preds = %entry
   %0 = shl i32 %Size, 2
   br label %for.body.4.lr.ph

for.body.4.lr.ph:                                 ; preds =
%for.body.4.lr.ph.preheader, %for.cond.cleanup.3
   %lsr.iv17 = phi i32 [ %Size, %for.body.4.lr.ph.preheader ], [ %lsr.iv.next18,
%for.cond.cleanup.3 ]
   %lsr.iv10 = phi i32 [ 0, %for.body.4.lr.ph.preheader ], [ %lsr.iv.next11,
%for.cond.cleanup.3 ]
   %uglygep = getelementptr i8, i8* %Src12, i32 %lsr.iv10
   %uglygep13 = bitcast i8* %uglygep to i32*
   %uglygep15 = getelementptr i8, i8* %Dst14, i32 %lsr.iv10
   %uglygep1516 = bitcast i8* %uglygep15 to i32*
   br label %for.body.4

for.body.4:                                       ; preds = %for.body.4,
%for.body.4.lr.ph
   %lsr.iv8 = phi i32* [ %scevgep9, %for.body.4 ], [ %uglygep13,
%for.body.4.lr.ph ]
   %lsr.iv3 = phi i32* [ %scevgep4, %for.body.4 ], [ %uglygep1516,
%for.body.4.lr.ph ]
   %lsr.iv = phi i32 [ %lsr.iv.next, %for.body.4 ], [ %Size, %for.body.4.lr.ph ]
   %1 = load i32, i32* %lsr.iv8, align 4, !tbaa !0
   %mul5 = mul i32 %1, %Val
   store i32 %mul5, i32* %lsr.iv3, align 4, !tbaa !0
   %lsr.iv.next = add i32 %lsr.iv, -1
   %scevgep4 = getelementptr i32, i32* %lsr.iv3, i32 1
   %scevgep9 = getelementptr i32, i32* %lsr.iv8, i32 1
   %exitcond = icmp eq i32 %lsr.iv.next, 0
   br i1 %exitcond, label %for.cond.cleanup.3, label %for.body.4

for.cond.cleanup.3:                               ; preds = %for.body.4
   %lsr.iv.next11 = add i32 %lsr.iv10, %0
   %lsr.iv.next18 = add i32 %lsr.iv17, -1
   %exitcond27 = icmp eq i32 %lsr.iv.next18, 0
   br i1 %exitcond27, label %for.cond.cleanup.loopexit, label %for.body.4.lr.ph

for.cond.cleanup.loopexit:                        ; preds = %for.cond.cleanup.3
   br label %for.cond.cleanup

for.cond.cleanup:                                 ; preds =
%for.cond.cleanup.loopexit, %entry
   ret void
}


-- Sanjoy

Possibly Parallel Threads

Search for more seemingly similar threads

llvm dev - Sep 2015 - [RFC] New pass: LoopExitValues

[llvm-dev] [RFC] New pass: LoopExitValues

[llvm-dev] [RFC] New pass: LoopExitValues

[llvm-dev] [RFC] New pass: LoopExitValues

[llvm-dev] [RFC] New pass: LoopExitValues

Possibly Parallel Threads