Cameron McInally via llvm-dev
2020-Nov-12 18:52 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
On Thu, Nov 12, 2020 at 12:03 PM Florian Hahn via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
>
> There's growing interest among our users in making better use of dedicated hardware instructions for complex math, and I would like to re-start the discussion on the topic. Given that the original thread was started a while ago, apologies if I missed anything already discussed earlier on the list or at the round-table. The original mail is quoted below.
>
> In particular, I'm interested in the AArch64 side of things, like using FCMLA [1] for complex multiplications to start with.
>
> To get the discussion going, I'd like to share an alternative pitch. Instead of starting with adding complex types, we could start with adding a set of intrinsics that operate on complex values packed into vectors.
>
> Starting with intrinsics would allow us to bring up the lowering of those intrinsics to target-specific nodes incrementally, without having to make the substantial changes across the codebase that adding new types would require. Initially, we could try to match IR patterns that correspond to complex operations late in the pipeline. We can then work on incrementally moving the point where the intrinsics are introduced earlier in the pipeline, as we adapt more passes to deal with them. This way, we won't have to teach all passes about complex types at once or risk losing out on all the existing combines on the corresponding floating-point operations.
>
> I think if we introduce a small set of intrinsics for complex math (like @llvm.complex.multiply) we could use them to improve code generation in key passes like the vectorizers and deliver large improvements to our users fairly quickly. There might be some scenarios which require a dedicated IR type, but I think we can get a long way with a set of specialized intrinsics at a much lower cost. If we later decide that dedicated IR types are needed, replacing the intrinsics should be easy, and we will benefit from having already updated various passes to deal with the intrinsics.
>
> We took a similar approach when adding matrix support to LLVM, and I think that worked out very well in the end. The implementation upstream generates equivalent or better code than our earlier implementation using dedicated IR matrix types, while being simpler and impacting a much smaller area of the codebase.
>
> An independent issue to discuss is how to generate complex math intrinsics. As part of the initial bring-up, I'd propose matching the code Clang generates for operations on std::complex<> & co to introduce the complex math intrinsics. This won't be perfect and will miss cases, but it allows us to deliver initial improvements without requiring extensive updates to existing libraries or frontends. I don't think either the intrinsic-only or the complex-type variant is inherently more convenient for frontends to emit.
>
> To better illustrate what this approach could look like, I put up a set of rough patches that introduce a @llvm.complex.multiply intrinsic (https://reviews.llvm.org/D91347), replace a set of fadd/fsub/fmul instructions with @llvm.complex.multiply (https://reviews.llvm.org/D91353), and lower the intrinsic to FCMLA on AArch64 (https://reviews.llvm.org/D91354). Note that those are just rough proof-of-concept patches.
>
> Cheers,
> Florian

Hi Florian,

The proposed experimental intrinsics are a difficult detour to accept for performance reasons. With a complex type, the usual algebraic simplifications fall out for free (or close to it).
Teaching existing optimizations how to handle the new complex intrinsics seems like a LOT of unnecessary work.

That said, we recently had this same conversation at Simon Moll's native predication sync-up meeting. Simon had some convincing ways to work around predicated intrinsic optimization (e.g. the PredicatedInstruction class). Maybe we should explore a more generalized solution that would cover complex intrinsics too?

Digressing a bit, have we ever discussed using a branch to develop something like complex support? That way we would avoid an experimental intrinsic implementation, but also not disturb the codebase until the implementation is complete.

-Cameron
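To make the "falls out for free" point concrete, here is a minimal sketch (illustrative only, not taken from any of the patches under review) of a typical floating-point simplification. It keys off the fmul instruction itself, so a complex multiply hidden behind an opaque @llvm.complex.multiply call never reaches it, and the fold would have to be re-taught about the intrinsic:

```
#include "llvm/IR/Instruction.h"
#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace llvm::PatternMatch;

// Sketch of an InstSimplify-style fold: fmul X, 1.0 --> X.
// It only fires on an actual fmul instruction; a complex multiply expressed
// as a call to @llvm.complex.multiply is an opaque CallInst and is skipped.
static Value *foldFMulByOne(Instruction &I) {
  Value *X;
  if (match(&I, m_FMul(m_Value(X), m_FPOne())))
    return X;
  return nullptr;
}
```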
Florian Hahn via llvm-dev
2020-Nov-12 19:47 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
> On Nov 12, 2020, at 18:52, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
>
> On Thu, Nov 12, 2020 at 12:03 PM Florian Hahn via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> [original proposal snipped]
> The proposed experimental intrinsics are a difficult detour to accept for performance reasons. With a complex type, the usual algebraic simplifications fall out for free (or close to it). Teaching existing optimizations how to handle the new complex intrinsics seems like a LOT of unnecessary work.

Thanks for taking a look!

Could you expand a bit more on what kind of unnecessary work you expect? I would expect most of the code dealing with the intrinsics to be easily migrated once/if we decide to switch to a dedicated type.

Concretely, for the lowering code, it should hopefully just boil down to updating the patterns that get matched in the backends (matching the complex multiply instruction instead of @llvm.complex.multiply). For supporting complex math in the vectorizers, we have to add support for cost modeling and for widening the intrinsics. Again, wouldn't changing from intrinsics to a type just mean adjusting from dealing with intrinsic calls to the corresponding instructions?

There certainly are some pieces around the edges that will need adjusting or become obsolete, but I would hope/expect the majority of the work to be re-usable.

As for getting the usual algebraic simplifications for free: even with a new type, I suppose we would have to teach instcombine/InstructionSimplify about them. This is indeed an area where a dedicated type is probably quite a bit easier to deal with. But I think part of the vector-predication proposal includes some generic matchers for the different intrinsics, which should be relatively straightforward to update as well.

I'll admit that the scope of my pitch is much more limited than the original proposal and very focused on allowing LLVM to use specialized instructions. But it should be relatively simple to implement (without impacting anything unrelated to the passes/backends that are adjusted) and easy to extend to meet the additional needs of other people & backends. It also allows us to reuse the existing instructions for insert/extract/shuffles and the corresponding folds.

Also, I've been looking at this through an AArch64 lens, so this proposal should be easy to map to the available instructions there. I would appreciate any thoughts on how this might clash with the available instructions on other targets.

> That said, we recently had this same conversation at Simon Moll's native predication sync-up meeting. Simon had some convincing ways to work around predicated intrinsic optimization (e.g. the PredicatedInstruction class). Maybe we should explore a more generalized solution that would cover complex intrinsics too?
>
> Digressing a bit, have we ever discussed using a branch to develop something like complex support? That way we would avoid an experimental intrinsic implementation, but also not disturb the codebase until the implementation is complete.

Personally, I would be cautious about going down that road. I think there is a big benefit to being able to bring up support on the main branch, in a way that the code is exercised by default. It provides substantial test coverage and allows us to spot regressions early on, hence avoiding a 'big switch' at the end. And it allows people to collaborate more easily than on a separate branch, which also needs to be kept up to date. I would expect adding dedicated complex types to be a big project, which will require a substantial amount of work to be done up front before yielding tangible benefits.
If there's a clear indication that dedicated types make our lives substantially easier, there's buy-in, and there are people willing to tackle this, I am all for it. Historically, however, getting new types in has been challenging, and there are quite a few pieces in LLVM that were added but never pushed across the finish line. Whatever direction we take, this is something I would like to avoid.

Cheers,
Florian
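For reference, this is the kind of source the initial bring-up would have to recognize (a sketch for illustration, not taken from D91353; the exact codegen depends on the standard library and flags):

```
#include <complex>

// Depending on the standard library and compile flags, this multiply either
// goes through a NaN/infinity fix-up path (e.g. the __mulsc3 helper for
// C _Complex) or -- under -ffast-math / -fcx-limited-range -- boils down to
// the textbook expansion
//   re = a.re*b.re - a.im*b.im,  im = a.re*b.im + a.im*b.re
// i.e. four fmuls, an fsub and an fadd, which is the pattern a late IR pass
// could rewrite into a single @llvm.complex.multiply call.
std::complex<float> mul(std::complex<float> a, std::complex<float> b) {
  return a * b;
}
```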
Krzysztof Parzyszek via llvm-dev
2020-Nov-12 21:13 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
Some architectures have instructions that assist with complex arithmetic. Without intrinsics it may be hard to use such instructions, especially because of the arithmetic simplifications. Perhaps, depending on TTI, those intrinsics could be expanded into the explicit arithmetic?

--
Krzysztof Parzyszek  kparzysz at quicinc.com   AI tools development
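One way to read that suggestion, as a rough sketch under stated assumptions (the two-operand @llvm.complex.multiply from D91347, a single complex number packed as <2 x float>, and a TTI hook that does not exist yet deciding whether to expand): targets without a suitable instruction simply get the explicit arithmetic back, so they are no worse off than today.

```
#include "llvm/IR/Constants.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicInst.h"

using namespace llvm;

// Expand a (proposed) @llvm.complex.multiply call into explicit arithmetic
// for the simplest case: one complex number packed as <2 x float> = (re, im).
// A real pass would first consult TTI (a hypothetical hook) and only expand
// when the target has no native support.
static Value *expandComplexMultiply(IntrinsicInst *II) {
  IRBuilder<> Builder(II);
  Value *A = II->getArgOperand(0), *B = II->getArgOperand(1);
  Value *ARe = Builder.CreateExtractElement(A, uint64_t(0));
  Value *AIm = Builder.CreateExtractElement(A, uint64_t(1));
  Value *BRe = Builder.CreateExtractElement(B, uint64_t(0));
  Value *BIm = Builder.CreateExtractElement(B, uint64_t(1));
  // (a + bi) * (c + di) = (ac - bd) + (ad + bc)i
  Value *Re = Builder.CreateFSub(Builder.CreateFMul(ARe, BRe),
                                 Builder.CreateFMul(AIm, BIm));
  Value *Im = Builder.CreateFAdd(Builder.CreateFMul(ARe, BIm),
                                 Builder.CreateFMul(AIm, BRe));
  Value *Res = UndefValue::get(II->getType());
  Res = Builder.CreateInsertElement(Res, Re, uint64_t(0));
  return Builder.CreateInsertElement(Res, Im, uint64_t(1));
}
```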
Cameron McInally via llvm-dev
2020-Nov-12 21:36 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
On Thu, Nov 12, 2020 at 2:47 PM Florian Hahn <florian_hahn at apple.com> wrote:
>
>> On Nov 12, 2020, at 18:52, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
>> [earlier messages snipped]
> Thanks for taking a look!
>
> Could you expand a bit more on what kind of unnecessary work you expect? I would expect most of the code dealing with the intrinsics to be easily migrated once/if we decide to switch to a dedicated type.
>
> Concretely, for the lowering code, it should hopefully just boil down to updating the patterns that get matched in the backends (matching the complex multiply instruction instead of @llvm.complex.multiply). For supporting complex math in the vectorizers, we have to add support for cost modeling and for widening the intrinsics. Again, wouldn't changing from intrinsics to a type just mean adjusting from dealing with intrinsic calls to the corresponding instructions?
>
> There certainly are some pieces around the edges that will need adjusting or become obsolete, but I would hope/expect the majority of the work to be re-usable.
>
> As for getting the usual algebraic simplifications for free: even with a new type, I suppose we would have to teach instcombine/InstructionSimplify about them. This is indeed an area where a dedicated type is probably quite a bit easier to deal with. But I think part of the vector-predication proposal includes some generic matchers for the different intrinsics, which should be relatively straightforward to update as well.

Yes, the IR passes are where I see complex types win. For example, pretty much all of InstCombine's visitFMul(...) transformations should not depend on the type.*** So there's no need to duplicate all that code for @llvm.complex.multiply.

It's true that teaching the pattern matcher to recognize intrinsics gets us a long way. But we'd also need to update IR passes in places where the Opcode is explicitly checked. E.g.:

```
switch (Opcode) {
  <snipped>
  case Instruction::FAdd:   <== ***HERE***
    return SimplifyFAddInst(LHS, RHS, FastMathFlags(), Q, MaxRecurse);
```

And when a transformation succeeds, we'd need to update how the result is returned. E.g.:

```
// (-X) + Y --> Y - X
Value *X, *Y;
if (match(&I, m_c_FAdd(m_FNeg(m_Value(X)), m_Value(Y))))
  return BinaryOperator::CreateFSubFMF(Y, X, &I);   <== ***HERE***
```

Both of those come for free with a complex type, along with the pattern matching code.

I'm not opposed to finding solutions for these problems, but this pitch has been around for a while now, and it hasn't really taken off yet. It's hard to rally behind it again.

(***) Let me point out that my number theory and abstract algebra are *rusty*. So don't be fooled into thinking my initial assumptions about complex numbers are correct.

> I'll admit that the scope of my pitch is much more limited than the original proposal and very focused on allowing LLVM to use specialized instructions. But it should be relatively simple to implement (without impacting anything unrelated to the passes/backends that are adjusted) and easy to extend to meet the additional needs of other people & backends. It also allows us to reuse the existing instructions for insert/extract/shuffles and the corresponding folds.

Generating the special instructions is good, but not if it costs other optimizations. Let's get both. :D

But in all seriousness, my opinion on this topic has hit stark resistance before.
I won't harp on it if others don't feel the same. It's just pretty clear to me that experimental intrinsics are a major hurdle to performant code for larger projects.
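To put the second point in code: if the same fold had to be taught about a complex-intrinsic family, returning the result means materializing a new intrinsic call rather than a BinaryOperator. A rough sketch only; Intrinsic::complex_fsub is entirely hypothetical, nothing like it exists upstream today:

```
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicInst.h"

using namespace llvm;

// Hypothetical counterpart of
//   return BinaryOperator::CreateFSubFMF(Y, X, &I);
// for an intrinsic-based complex add: the fold has to rebuild a call,
// thread the operand type through, and copy the fast-math flags by hand.
static Value *buildComplexFSub(IntrinsicInst &Add, Value *Y, Value *X) {
  IRBuilder<> Builder(&Add);
  return Builder.CreateIntrinsic(Intrinsic::complex_fsub, {Add.getType()},
                                 {Y, X}, /*FMFSource=*/&Add);
}
```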
Simon Moll via llvm-dev
2020-Nov-13 10:13 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
Hi,

On 11/12/20 7:53 PM, Cameron McInally via llvm-dev wrote:
> On Thu, Nov 12, 2020 at 12:03 PM Florian Hahn via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> [original proposal snipped]
>
> Hi Florian,
>
> The proposed experimental intrinsics are a difficult detour to accept for performance reasons.
> With a complex type, the usual algebraic simplifications fall out for free (or close to it). Teaching existing optimizations how to handle the new complex intrinsics seems like a LOT of unnecessary work.
>
> That said, we recently had this same conversation at Simon Moll's native predication sync-up meeting. Simon had some convincing ways to work around predicated intrinsic optimization (e.g. the PredicatedInstruction class). Maybe we should explore a more generalized solution that would cover complex intrinsics too?

The generalized pattern matching in the VP reference patch is not VP-specific, i.e. it is parameterized in the abstraction. That means we can lift InstCombine/InstSimplify once on top of that abstraction and then instantiate it (it's literally a template parameter) for (0) regular LLVM instructions, (1) constrained fp intrinsics, (2) complex intrinsics, (3) VP, and hypothetically even (4) constrained/complex/vp intrinsics.

I'll send out a separate RFC on how that generalized pattern match works - it's about time we get working on this, since use cases keep piling up.

- Simon

> Digressing a bit, have we ever discussed using a branch to develop something like complex support? That way we would avoid an experimental intrinsic implementation, but also not disturb the codebase until the implementation is complete.
>
> -Cameron
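Purely to illustrate the shape of that idea (this is not the VP reference patch, and the MatcherContext name is made up): the fold body is written once against an abstract context, and each instruction family, plain IR, constrained FP, complex intrinsics, or VP, supplies its own instantiation.

```
#include "llvm/IR/Value.h"

using namespace llvm;

// Illustrative only: one fold, written once, parameterized over the
// "instruction family" it operates on. A PlainIRContext would match real
// fneg/fadd and build a real fsub; a ComplexContext would match and build
// the corresponding complex intrinsics; a VPContext the VP ones.
template <typename MatcherContext>
Value *simplifyAddOfNeg(Value *LHS, Value *RHS, MatcherContext &MC) {
  Value *X = nullptr;
  if (MC.matchNeg(LHS, X))   // (-X) + Y --> Y - X
    return MC.createSub(RHS, X);
  if (MC.matchNeg(RHS, X))   // Y + (-X) --> Y - X
    return MC.createSub(LHS, X);
  return nullptr;
}
```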
David Greene via llvm-dev
2020-Nov-13 14:49 UTC
[llvm-dev] Complex proposal v3 + roundtable agenda
Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Some architectures have instructions that assist with complex arithmetic. Without intrinsics it may be hard to use such instructions, especially because of the arithmetic simplifications. Perhaps, depending on TTI, those intrinsics could be expanded into the explicit arithmetic?

Can you provide some examples of what you mean?

-David