search for: polymag

Displaying 5 results from an estimated 5 matches for "polymag".

2019 Sep 02
2
AVX2 codegen - question reg. FMA generation
... LLVM Developers mailing list, llvm-dev at lists.llvm.org, https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -- Founder and Director, PolyMage Labs
2019 Sep 02
3
AVX2 codegen - question reg. FMA generation
Hello, On the appended reasonably simple test case that has an fmul/fadd sequence on <8 x float> vector types, I don't see the x86-64 code generator (with cpu set to haswell or later types) turning it into AVX2 FMA instructions. Here's the snippet in the output it generates: $ llc -O3 -mcpu=skylake --------------------- .LBB0_2: # =>This Inner
2020 Jan 30
2
Writing loop transformations on the right representation is more productive
On Mon, 27 Jan 2020 at 22:06, Uday Kumar Reddy Bondhugula <uday at polymagelabs.com> wrote: > Hi Michael, > > Although the approach to use a higher order in-memory abstraction like the > loop tree will make it easier than what you have today, if you used MLIR > for this representation, you already get a round-trippable textual format > that is *very close*...
2020 Feb 03
5
Writing loop transformations on the right representation is more productive
On Thu, 30 Jan 2020 at 04:40, Uday Kumar Reddy Bondhugula <uday at polymagelabs.com> wrote: > There are multiple ways regions in MLIR can be viewed, but the more relevant point here is you do have a loop tree structure native in the IR with MLIR. Regions in MLIR didn't evolve from modeling inlined calls - the affine.for/affine.if were originally the only two operation...
2020 Jan 03
10
Writing loop transformations on the right representation is more productive
In the 2018 LLVM DevMtg [1], I presented some shortcomings of how LLVM optimizes loops. In summary, the biggest issues are (a) the complexity of writing a new loop optimization pass (including needing to deal with a variety of low-level issues, a significant amount of required boilerplate, the difficulty of analysis preservation, etc.), (b) independent optimization heuristics and a fixed pass