thr3ads.net - llvm dev - [llvm-dev] Vector ABI mismatch during inlining [Mar 2021]

Nikita Popov via llvm-dev

2021-Mar-14 21:09 UTC

[llvm-dev] Vector ABI mismatch during inlining

Hi,

Consider the following code:

target triple = "x86_64-unknown-linux-gnu"

define void @test1() "target-features"="+avx" {
  call void @test2()
  ret void
}

define void @test2() {
  call void @test3(<4 x i64> <i64 0, i64 1, i64 2, i64 3>)
  ret void
}

define void @test3(<4 x i64> %arg) noinline {
  ret void
}

The inliner will inline test2 into test1, because test1's target features
are a superset of test2's:

target triple = "x86_64-unknown-linux-gnu"

define void @test1() "target-features"="+avx" {
  call void @test3(<4 x i64> <i64 0, i64 1, i64 2, i64 3>)
  ret void
}

define void @test2() {
  call void @test3(<4 x i64> <i64 0, i64 1, i64 2, i64 3>)
  ret void
}

define void @test3(<4 x i64> %arg) noinline {
  ret void
}

Now we have a problem: X86 uses different vector ABIs depending on target
features. test3 is compiled without avx, and as such expects the argument
to be passed in two XMM registers. test2 is also compiled without avx and
performs the call correctly. test1 on the other hand is compiled with avx
and will instead pass a single YMM register.

Note that the by-value vector arguments do not necessarily have to be
present in the original code -- even if the frontend passes all vectors by
memory to avoid precisely this issue, argument promotion can convert them
to by-value arguments, as the caller/callee ABIs match at the time argument
promotion runs (between test2 and test3).
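
For illustration, a hypothetical "before" form of the test2/test3 pair might
look like the sketch below, where the vector is passed through memory. It is
written with opaque pointers (ptr). The internal linkage on test3, the @sink
global, and the %slot/%p/%v names are assumptions added for this example and
are not part of the code above. Since both functions have the same target
features, argument promotion could rewrite test3 to take the <4 x i64> by
value, which leads to the by-value call shown in the first example.

```llvm
target triple = "x86_64-unknown-linux-gnu"

; Hypothetical global so that the loaded vector has a use.
@sink = global <4 x i64> zeroinitializer

; The caller spills the vector to a stack slot and passes its address.
define void @test2() {
  %slot = alloca <4 x i64>, align 32
  store <4 x i64> <i64 0, i64 1, i64 2, i64 3>, ptr %slot, align 32
  call void @test3(ptr %slot)
  ret void
}

; Assumption: internal linkage, so the signature is free to be rewritten.
; The callee only loads from the pointer.
define internal void @test3(ptr %p) noinline {
  %v = load <4 x i64>, ptr %p, align 32
  store <4 x i64> %v, ptr @sink, align 32
  ret void
}
```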

I would like to have some input on how this miscompile could be addressed.
The two general options I see are:

1. Fix call lowering to respect callee target features. That is, even if
the caller is +avx, if the callee is not then we should pass arguments via
xmm rather than ymm. I'm not sure to what degree this is possible with
our current infrastructure though.

2. Prevent inlining in this case. I don't think we can prevent inlining
across functions with different target features entirely, as that would be
a performance disaster. But possibly we can inspect the body of the callee
to check for calls that would be problematic under the new ABI.
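
As a sketch of what such a body inspection would need to distinguish,
consider a hypothetical callee whose only call passes a scalar. The names
@test2_scalar and @helper are made up for this example. An i64 argument is
passed the same way with or without avx, so inlining this body into a +avx
caller would not change how the call is lowered, unlike the <4 x i64> call
in test2.

```llvm
target triple = "x86_64-unknown-linux-gnu"

define void @test1() "target-features"="+avx" {
  call void @test2_scalar()
  ret void
}

; No vector arguments in the body: the call to @helper does not depend on avx.
define void @test2_scalar() {
  call void @helper(i64 42)
  ret void
}

define void @helper(i64 %arg) noinline {
  ret void
}
```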

Regards,
Nikita

David Blaikie via llvm-dev

2021-Mar-14 22:23 UTC

+echristo who implemented some of this stuff a while back - I don't recall
discussing this particular problem before, though...

(1) seems sort of plausible to me (speaking fairly naively) - do we
correctly annotate function declarations with the attributes today? (eg: if
test3 were defined in some other translation unit - would the declaration
of test3 in this translation unit have the right avx attributes on it?)

