Andrew Kelley via llvm-dev
2018-Nov-17 21:09 UTC
[llvm-dev] error: couldn't allocate input reg for constraint '{xmm0}'
Here is some zig code:
pub fn setXmm0(comptime T: type, value: T) void {
    comptime assert(builtin.arch == builtin.Arch.x86_64);
    const aligned_value: T align(16) = value;
    asm volatile (
        \\movaps (%[ptr]), %%xmm0
        :
        : [ptr] "r" (&aligned_value)
        : "xmm0"
    );
}
I want to improve this and integrate more tightly with LLVM IR, like this:
asm volatile (""
    :
    : [value] "{xmm0}" (value)
);
Here, the "{xmm0}" constraint tells LLVM to make sure xmm0 is set to value,
in whatever way it needs to. Here is the LLVM IR:
call void asm sideeffect "", "{xmm0}"(i128 %1)
But LLVM gives me this error:
error: couldn't allocate input reg for constraint '{xmm0}'
Is this a bug in LLVM or some fundamental limitation?
Andrew Kelley via llvm-dev
2018-Nov-17 21:43 UTC
[llvm-dev] error: couldn't allocate input reg for constraint '{xmm0}'
rkruppe on IRC suggested passing <4 x i32> rather than i128,
and that worked. I edited the IR module by hand like this:
%V = bitcast i128 %1 to <4 x i32>
call void asm sideeffect "", "{xmm0}"(<4 x i32> %V), !dbg !60
This produced the following assembly:
0000000000000030 <setXmm0>:
  30:   55                      push   %rbp
  31:   48 89 e5                mov    %rsp,%rbp
  34:   48 89 7d f0             mov    %rdi,-0x10(%rbp)
  38:   48 89 75 f8             mov    %rsi,-0x8(%rbp)
  3c:   0f 10 45 f0             movups -0x10(%rbp),%xmm0
  40:   5d                      pop    %rbp
  41:   c3                      retq
I think that's good! My only concern is whether LLVM respected the
alignment requirement of movups instruction. I suppose the calling
convention requires %rbp to be aligned to 16 already, and so
-0x10(%rbp) will be also guaranteed to be 16 bytes aligned?
Regards,
Andrew
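
For comparison, the same idea (hand the compiler a vector-typed value and pin it to xmm0, rather than loading it by hand) can be sketched in C. This is only an illustration, not what Zig or the IR above does: it assumes GCC or Clang on x86-64 with SSE2, uses the GNU local register variable extension to name xmm0, and the function name roundtrip_through_xmm0 is made up for this example. On other targets it falls back to a plain copy so that it still compiles.

```c
#include <string.h>

#if defined(__x86_64__) && defined(__SSE2__)
#include <emmintrin.h>

/* Hypothetical helper: places 16 bytes in xmm0 and copies them back out. */
static void roundtrip_through_xmm0(const unsigned char in[16],
                                   unsigned char out[16]) {
    /* GNU extension: when v is used as an asm operand it must live in xmm0.
       The value has a vector type (__m128i), not a 128-bit integer type. */
    register __m128i v __asm__("xmm0") = _mm_loadu_si128((const __m128i *)in);
    __m128i copy;

    /* The compiler decides how to get v into xmm0; the asm only reads it. */
    __asm__ volatile("movdqa %%xmm0, %0" : "=x"(copy) : "x"(v));

    _mm_storeu_si128((__m128i *)out, copy);
}
#else
/* Fallback for non-x86-64 targets: no xmm0 to demonstrate, just copy. */
static void roundtrip_through_xmm0(const unsigned char in[16],
                                   unsigned char out[16]) {
    memcpy(out, in, 16);
}
#endif
```

Calling roundtrip_through_xmm0(in, out) with any 16-byte buffer should leave out equal to in; the interesting part is that no explicit load into xmm0 is written by hand.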
Joerg Sonnenberger via llvm-dev
2018-Nov-17 22:00 UTC
[llvm-dev] error: couldn't allocate input reg for constraint '{xmm0}'
On Sat, Nov 17, 2018 at 04:43:15PM -0500, Andrew Kelley via llvm-dev wrote:
> I think that's good! My only concern is whether LLVM respected the
> alignment requirement of movups instruction. I suppose the calling
> convention requires %rbp to be aligned to 16 already, and so
> -0x10(%rbp) will be also guaranteed to be 16 bytes aligned?

Yes, the calling convention guarantees a fixed stack alignment on entrance.

Joerg
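
The alignment argument can be checked with a little arithmetic. The sketch below is a plain integer model in C, not real stack inspection. It assumes the System V x86-64 convention that %rsp is a multiple of 16 at the call instruction, and it follows the prologue from the disassembly in the thread (push %rbp; mov %rsp,%rbp). The function names are made up for illustration. Note also that movups, unlike movaps, does not require an aligned operand, so the load shown in the thread would not fault even on an unaligned slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of %rbp after the prologue, given %rsp just before the call.
   Assumption: System V x86-64 ABI, 8-byte return address and saved %rbp. */
static uint64_t rbp_after_prologue(uint64_t rsp_before_call) {
    uint64_t rsp = rsp_before_call;
    rsp -= 8;   /* call pushes the return address */
    rsp -= 8;   /* push %rbp */
    return rsp; /* mov %rsp,%rbp */
}

/* True if the slot at -0x10(%rbp) is 16-byte aligned. */
static bool slot_is_16_aligned(uint64_t rsp_before_call) {
    uint64_t slot = rbp_after_prologue(rsp_before_call) - 0x10;
    return slot % 16 == 0;
}
```

With a 16-byte aligned %rsp at the call, the two 8-byte pushes move %rbp down by exactly 16, so %rbp and -0x10(%rbp) are both 16-byte aligned. If the caller's %rsp were off by 8, the slot would be misaligned as well.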