Steve (Numerics) Canon via llvm-dev
2020-Dec-04 19:23 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
(Late to the party) I think there’s a lot of good questions in this thread, but I also want Florian to get started landing some patches and then everyone can iterate on that.

Slight pushback on this one question:

> On Nov 19, 2020, at 18:11, Cameron McInally via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Wed, Nov 18, 2020 at 4:47 PM Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> Complex type would pose another issue for vectorization: in general it's better to have a vector of the real parts and a vector of the imaginary parts for the purpose of arithmetic, than having vectors of complex elements (i.e. real and imaginary parts interleaved).
>
> Is that universally true? I think it depends on the target. Let's take Florian's FCMLA example. The inputs and output are interleaved. And if you need just the reals/imags from an interleaved vector for something else, LD2/ST2 should be pretty fast on recent chips.

FCMLA makes it so _if your data is already interleaved and you can’t change it_, you can still operate efficiently in that format. But if you have control, and you’re doing anything beyond basic arithmetic, it’s still advantageous to use a planar [SoA] layout instead of interleaved [AoS]. For a simple example, if you want to vectorize a complex exponential function, you’ll want to compute cos and sin of the imaginary parts, and exp of the real parts; FCMLA doesn’t help here.

So Florian’s proposal:

> In the short-term I think to get things rolling it would be good to focus on the layout as defined by frontends (e.g. Clang/C++).

Sounds great to me.

- Steve
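
To make the complex exponential example above concrete, here is a minimal C++ sketch of it on a planar layout, using exp(a + bi) = exp(a) * (cos(b) + i*sin(b)). The names `PlanarComplex` and `planar_exp` are hypothetical and do not come from the thread or from LLVM. The loop is plain scalar code; the point is only that each array is read with unit stride, so the exp, cos and sin calls each map onto whole vectors of reals or imaginaries without any deinterleaving.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Planar (SoA) storage: all real parts in one array, all imaginary parts
// in another. Hypothetical type, for illustration only.
struct PlanarComplex {
    std::vector<double> re;
    std::vector<double> im;
};

// Element-wise complex exponential on planar data:
//   exp(a + bi) = exp(a) * (cos(b) + i*sin(b))
PlanarComplex planar_exp(const PlanarComplex& z) {
    assert(z.re.size() == z.im.size());
    const std::size_t n = z.re.size();
    PlanarComplex out{std::vector<double>(n), std::vector<double>(n)};
    for (std::size_t i = 0; i < n; ++i) {
        const double mag = std::exp(z.re[i]);   // needs only the real part
        out.re[i] = mag * std::cos(z.im[i]);    // needs only the imaginary part
        out.im[i] = mag * std::sin(z.im[i]);
    }
    return out;
}
```

With an interleaved layout the same loop would read every other element of a single array, which is the case where a deinterleaving load such as LD2 would be needed first.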
Sjoerd Meijer via llvm-dev
2021-Sep-30 13:23 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
Hello,

I would like to revive this thread.

We started thinking about supporting complex numbers, for the same reason Florian stated in his first mail. The idea we had, before I googled and stumbled on this thread, was the same as the one Florian proposed. So yes, I would like to support this approach. 🙂

I would like to add that GCC's approach is the same/similar: complex number patterns are recognised in the SLP vectoriser, and complex number builtins are emitted (and then matched later).

Any objections if we go for this approach? Florian, any plans to pick this up?

Cheers,
Sjoerd.
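
For readers unfamiliar with what a "complex number pattern" looks like in scalar code, the following is a minimal C++ sketch of a complex multiply on interleaved data, using (a + bi)(c + di) = (ac - bd) + (ad + bc)i. The function name `interleaved_mul` is hypothetical, and this is not claimed to be the exact shape that GCC or any LLVM patch matches; it only illustrates the kind of paired multiply/subtract and multiply/add expressions that such a recogniser would have to find.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleaved (AoS) storage: re0, im0, re1, im1, ...
// Element-wise complex multiply:
//   (a + bi) * (c + di) = (ac - bd) + (ad + bc)i
std::vector<double> interleaved_mul(const std::vector<double>& x,
                                    const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() % 2 == 0);
    std::vector<double> out(x.size());
    for (std::size_t i = 0; i < x.size(); i += 2) {
        const double a = x[i], b = x[i + 1];  // a + bi
        const double c = y[i], d = y[i + 1];  // c + di
        out[i]     = a * c - b * d;           // real part
        out[i + 1] = a * d + b * c;           // imaginary part
    }
    return out;
}
```

For example, multiplying the single element 1 + 2i by 3 + 4i yields -5 + 10i.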
Florian Hahn via llvm-dev
2021-Oct-04 10:16 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
> On Sep 30, 2021, at 14:23, Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:
>
> Hello,
>
> I would like to revive this thread.
>
> We started thinking about supporting complex numbers, for the same reason Florian stated in his first mail. The idea we had, before I googled and stumbled on this thread, was the same as the one Florian proposed. So yes, I would like to support this approach. 🙂 I would like to add that GCC's approach is the same/similar: complex number patterns are recognised in the SLP vectoriser, and complex number builtins are emitted (and then matched later).
>
> Any objections if we go for this approach? Florian, any plans to pick this up?

I still think the intrinsics route is a good way to start and to incrementally improve support for architectures that have dedicated instructions for complex multiply & co relatively quickly. I rebased the patches sketching support for such an intrinsic on AArch64 (https://reviews.llvm.org/D91346 and stacked patches).

But I won’t be able to continue lobbying for this approach in the near future. I’d be more than happy if anybody would be interested in picking this up.

Cheers,
Florian