search for: matrix_mul

Displaying 7 results from an estimated 7 matches for "matrix_mul".

2015 Sep 01
2
[RFC] New pass: LoopExitValues
...L.E.V. pass -Os size: 1.5% smaller with the L.E.V. pass In the case of Coremark, the benefit comes mainly from the matrix portion benchmark, which uses nested loops. Similarly, I used a matrix multiplication for the regression test as shown below. The L.E.V. pass eliminated 4 instructions. void matrix_mul(unsigned int Size, unsigned int *Dst, unsigned int *Src, unsigned int Val) { for (int Outer = 0; Outer < Size; ++Outer) for (int Inner = 0; Inner < Size; ++Inner) Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val; } With LoopExitValues -------------------------------...
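The kind of redundancy at issue in the excerpt above can be illustrated at the source level. The following is only a rough C analogy, not the pass itself, which operates on LLVM IR after LSR. The function names `scale_recompute`, `scale_reuse_exit`, and `forms_agree` are hypothetical, and the `const` qualifier and unsigned loop counters are assumptions not present in the quoted `matrix_mul`. The first form recomputes each row base from `Outer * Size`; the second reuses the pointer values the inner loop already holds when it exits.

```c
#include <assert.h>
#include <stddef.h>

/* Form 1: every outer iteration recomputes the row base from Outer * Size,
   matching the indexing style of the quoted matrix_mul source. */
static void scale_recompute(unsigned Size, unsigned *Dst,
                            const unsigned *Src, unsigned Val) {
    assert(Size == 0 || (Dst && Src));
    for (unsigned Outer = 0; Outer < Size; ++Outer) {
        unsigned *DstRow = Dst + (size_t)Outer * Size;
        const unsigned *SrcRow = Src + (size_t)Outer * Size;
        for (unsigned Inner = 0; Inner < Size; ++Inner)
            DstRow[Inner] = SrcRow[Inner] * Val;
    }
}

/* Form 2: the inner loop advances the pointers itself. When it exits, they
   already point at the start of the next row, so nothing is recomputed. */
static void scale_reuse_exit(unsigned Size, unsigned *Dst,
                             const unsigned *Src, unsigned Val) {
    assert(Size == 0 || (Dst && Src));
    unsigned *DstPtr = Dst;
    const unsigned *SrcPtr = Src;
    for (unsigned Outer = 0; Outer < Size; ++Outer) {
        for (unsigned Inner = 0; Inner < Size; ++Inner)
            *DstPtr++ = *SrcPtr++ * Val;
        /* Exit values: DstPtr == Dst + (Outer + 1) * Size, same for SrcPtr. */
    }
}

/* Returns 1 when both forms produce the same, expected 3x3 result. */
int forms_agree(void) {
    enum { N = 3 };
    const unsigned Count = N * N;
    unsigned Src[N * N], A[N * N], B[N * N];
    for (unsigned I = 0; I < Count; ++I) {
        Src[I] = I + 1;
        A[I] = 0;
        B[I] = 0;
    }
    scale_recompute(N, A, Src, 7);
    scale_reuse_exit(N, B, Src, 7);
    for (unsigned I = 0; I < Count; ++I)
        if (A[I] != B[I] || A[I] != (I + 1) * 7)
            return 0;
    return 1;
}
```

Calling `forms_agree()` from a test harness checks that the two forms compute identical results. Whether a compiler emits fewer instructions for one form than the other depends on the target and optimization level.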
2015 Sep 03
2
[RFC] New pass: LoopExitValues
...it confused about what pattern exactly this pass is supposed to > trigger on. I understand the mechanics, but I still can't quite see what > patterns it would be useful on. You've mentioned matrix multiply - how does > this pass alter the IR? Here's before and after IR for the matrix_mul example. Notice the two bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V pass converts these to scevgep values that already exist. *** Code after LSR *** ; Function Attrs: nounwind optsize define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src,...
2015 Sep 11
5
[RFC] New pass: LoopExitValues
Hi Steve, it seems the general consensus is that the patch feels like a work-around for a problem with LSR (and possibly other loop transformations) that introduces redundant instructions. It is probably best to file a bug and a few of your test cases. Thanks, Gerolf > On Sep 10, 2015, at 4:37 PM, Steve King via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > On Thu, Sep 10, 2015
2015 Aug 31
2
[RFC] New pass: LoopExitValues
Hello LLVM, This is a proposal for a new pass that improves performance and code size in some nested loop situations. The pass is target independent. From the description in the file header: This optimization finds loop exit values reevaluated after the loop execution and replaces them by the corresponding exit values if they are available. Such sequences can arise after the
2015 Sep 10
2
[RFC] New pass: LoopExitValues
...supposed to > >> trigger on. I understand the mechanics, but I still can't quite see what > >> patterns it would be useful on. You've mentioned matrix multiply - how > does > >> this pass alter the IR? > > > > Here's before and after IR for the matrix_mul example. Notice the two > > bitcasts %1 and %2 generated in the for.cond.cleanup block. The L.E.V > > pass converts these to scevgep values that already exist. > > > > *** Code after LSR *** > > > > ; Function Attrs: nounwind optsize > > define void @matr...
2015 Sep 26
2
[RFC] New pass: LoopExitValues
...n; } I don't know how much time you're willing to commit to this, but perhaps a more principled fix is to change LSR to actually work with nested loops? If I comment out this change, after LSR the matrix_mul routine does not actually look any better (possibly even worse): define void @matrix_mul(i32 %Size, i32* nocapture %Dst, i32* nocapture readonly %Src, i32 %Val) { entry: %Src12 = bitcast i32* %Src to i8* %Dst14 = bitcast i32* %Dst to i8* %cmp.25 = icmp eq i32 %Size, 0 br i1 %cmp.25, label %for.cond.cleanup, label %for.body.4.lr.ph.preheader for.body.4.lr.ph.preheader:...
2015 Sep 23
3
[RFC] New pass: LoopExitValues
On Wed, Sep 23, 2015 at 12:00 PM, Hal Finkel <hfinkel at anl.gov> wrote: >> >> Should we try the patch in its current location, namely after LSR? > > Sure; post the patch as you have it so we can look at what's going on. > http://reviews.llvm.org/D12494 One particular point: The algorithm checks that SCEVs are equal when their raw pointers are equal. Is