thr3ads.net - search: "polymage"

Displaying 5 results from an estimated 5 matches for "polymage".

AVX2 codegen - question reg. FMA generation

2019 Sep 02

... -- Founder and Director, PolyMage Labs

AVX2 codegen - question reg. FMA generation

2019 Sep 02

Hello, On the appended reasonably simple test case that has an fmul/fadd sequence on <8 x float> vector types, I don't see the x86-64 code generator (with cpu set to haswell or later types) turning it into AVX2 FMA instructions. Here's the snippet in the output it generates: $ llc -O3 -mcpu=skylake --------------------- .LBB0_2: # =>This Inner

Writing loop transformations on the right representation is more productive

2020 Jan 30

On Mon, Jan 27, 2020 at 22:06, Uday Kumar Reddy Bondhugula <uday at polymagelabs.com> wrote: > Hi Michael, > > Although the approach to use a higher-order in-memory abstraction like the loop tree will make it easier than what you have today, if you used MLIR for this representation, you already get a round-trippable textual format that is *very close*...

Writing loop transformations on the right representation is more productive

2020 Feb 03

On Thu, Jan 30, 2020 at 04:40, Uday Kumar Reddy Bondhugula <uday at polymagelabs.com> wrote: > There are multiple ways regions in MLIR can be viewed, but the more relevant point here is you do have a loop tree structure native in the IR with MLIR. Regions in MLIR didn't evolve from modeling inlined calls - the affine.for/affine.if were originally the only two operations...

Writing loop transformations on the right representation is more productive

2020 Jan 03

In the 2018 LLVM DevMtg [1], I presented some shortcomings of how LLVM optimizes loops. In summary, the biggest issues are (a) the complexity of writing a new loop optimization pass (including needing to deal with a variety of low-level issues, a significant amount of required boilerplate, the difficulty of analysis preservation, etc.), (b) independent optimization heuristics and a fixed pass