thr3ads.net - search: "matrix

Displaying 7 results from an estimated 7 matches for "matrix_mul".

2015 Sep 01

[RFC] New pass: LoopExitValues

...L.E.V. pass -Os size: 1.5% smaller with the L.E.V. pass

In the case of Coremark, the benefit comes mainly from the matrix portion of the benchmark, which uses nested loops. Similarly, I used a matrix multiplication for the regression test as shown below. The L.E.V. pass eliminated 4 instructions.

void matrix_mul(unsigned int Size, unsigned int *Dst, unsigned int *Src, unsigned int Val) {
  for (int Outer = 0; Outer < Size; ++Outer)
    for (int Inner = 0; Inner < Size; ++Inner)
      Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}

With LoopExitValues
-------------------------------...
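For reference, the matrix_mul routine quoted in this result multiplies every element of a Size x Size array by the scalar Val, despite its name. The following is a minimal sketch for checking that behavior. It assumes two editorial changes that are not in the original post: unsigned loop counters and a const-qualified Src. The helper count_mismatches is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* The routine from the quoted post, with unsigned loop counters and a
   const Src (both editorial assumptions, not part of the original). */
void matrix_mul(unsigned int Size, unsigned int *Dst,
                const unsigned int *Src, unsigned int Val) {
  /* Precondition: non-empty inputs need valid buffers. */
  assert(Size == 0 || (Dst != NULL && Src != NULL));
  for (unsigned int Outer = 0; Outer < Size; ++Outer)
    for (unsigned int Inner = 0; Inner < Size; ++Inner)
      Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
}

/* Hypothetical helper: counts elements where Dst[i] != Src[i] * Val. */
unsigned int count_mismatches(unsigned int Size, const unsigned int *Dst,
                              const unsigned int *Src, unsigned int Val) {
  unsigned int Bad = 0;
  for (unsigned int I = 0; I < Size * Size; ++I)
    if (Dst[I] != Src[I] * Val)
      ++Bad;
  return Bad;
}
```

Calling matrix_mul(2, Dst, Src, 3) on a four-element Src and then count_mismatches(2, Dst, Src, 3) should report zero mismatches.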

2015 Sep 03

[RFC] New pass: LoopExitValues

...it confused about what pattern exactly this pass is supposed to
> trigger on. I understand the mechanics, but I still can't quite see what
> patterns it would be useful on. You've mentioned matrix multiply - how does
> this pass alter the IR?

Here's before and after IR for the matrix_mul example. Notice the two bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V pass converts these to scevgep values that already exist.

*** Code after LSR ***

; Function Attrs: nounwind optsize
define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src,...

2015 Sep 11

[RFC] New pass: LoopExitValues

Hi Steve,

It seems the general consensus is that the patch feels like a work-around for a problem with LSR (and possibly other loop transformations) that introduces redundant instructions. It is probably best to file a bug and a few of your test cases.

Thanks
Gerolf

> On Sep 10, 2015, at 4:37 PM, Steve King via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Thu, Sep 10, 2015

2015 Aug 31

[RFC] New pass: LoopExitValues

Hello LLVM,

This is a proposal for a new pass that improves performance and code size in some nested loop situations. The pass is target independent.

From the description in the file header:

This optimization finds loop exit values reevaluated after the loop execution and replaces them by the corresponding exit values if they are available. Such sequences can arise after the
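The idea in the file-header description can be pictured at the source level. The sketch below is a hand-written analogy, not output from the pass and not its implementation. It assumes the nested loops have already been rewritten to walk pointers, and the function names scale_recompute and scale_reuse are hypothetical. In the first version the next row's start address is recomputed after the inner loop. In the second, the pointers left behind when the inner loop exits already hold that value, so they are reused.

```c
#include <assert.h>
#include <stddef.h>

/* "Before" shape (hypothetical): the next row's start address is
   recomputed from the outer counter after the inner loop has finished. */
void scale_recompute(unsigned Size, unsigned *Dst, const unsigned *Src,
                     unsigned Val) {
  assert(Size == 0 || (Dst != NULL && Src != NULL));
  unsigned *D = Dst;
  const unsigned *S = Src;
  for (unsigned Outer = 0; Outer < Size; ++Outer) {
    for (unsigned Inner = 0; Inner < Size; ++Inner)
      D[Inner] = S[Inner] * Val;
    /* Re-evaluated after the loop: extra multiply and add per row. */
    D = Dst + (size_t)(Outer + 1) * Size;
    S = Src + (size_t)(Outer + 1) * Size;
  }
}

/* "After" shape (hypothetical): the inner loop advances D and S, and
   their values at loop exit are exactly the next row's start. */
void scale_reuse(unsigned Size, unsigned *Dst, const unsigned *Src,
                 unsigned Val) {
  assert(Size == 0 || (Dst != NULL && Src != NULL));
  unsigned *D = Dst;
  const unsigned *S = Src;
  for (unsigned Outer = 0; Outer < Size; ++Outer) {
    unsigned *RowEnd = D + Size;
    while (D != RowEnd)
      *D++ = *S++ * Val;
    /* D and S already hold the exit values; nothing to recompute. */
  }
}
```

Both functions should produce identical output for the same inputs. The only difference is whether the per-row address arithmetic is repeated after the inner loop or taken from values that loop has already computed.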

2015 Sep 10

[RFC] New pass: LoopExitValues

...supposed to
> >> trigger on. I understand the mechanics, but I still can't quite see what
> >> patterns it would be useful on. You've mentioned matrix multiply - how does
> >> this pass alter the IR?
> >
> > Here's before and after IR for the matrix_mul example. Notice the two
> > bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V
> > pass converts these to scevgep values that already exist.
> >
> > *** Code after LSR ***
> >
> > ; Function Attrs: nounwind optsize
> > define void @matr...

2015 Sep 26

[RFC] New pass: LoopExitValues

...n; }

I don't know how much time you're willing to commit to this, but perhaps a more principled fix is to change LSR to actually work with nested loops? If I comment out this change, after LSR the matrix_mul routine does not actually look any better (possibly even worse):

define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src, i32 %Val) {
entry:
  %Src12 = bitcast i32* %Src to i8*
  %Dst14 = bitcast i32* %Dst to i8*
  %cmp.25 = icmp eq i32 %Size, 0
  br i1 %cmp.25, label %for.cond.cleanup, label %for.body.4.lr.ph.preheader

for.body.4.lr.ph.preheader:...

2015 Sep 23

[RFC] New pass: LoopExitValues

On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>
>> Should we try the patch in its current location, namely after LSR?
>
> Sure; post the patch as you have it so we can look at what's going on.
>

http://reviews.llvm.org/D12494

One particular point: The algorithm checks that SCEVs are equal when their raw pointers are equal. Is
