thr3ads.net - llvm dev - [llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control. [Apr 2021]

If this information is useful, please help other people find it:
Share via:

Reid Kleckner via llvm-dev

2021-Apr-15 16:58 UTC

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

On Wed, Apr 14, 2021 at 11:58 AM James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> What I suspect you *actually* want here is an option to tell Clang not to
> infer load/store alignments based on object types or alignment attributes
> -- instead treating everything as being potentially aligned to 1 unless the
> allocation is seen (e.g. global/local variables). Clang would still need to
> use the usual alignment computation for variable definitions and structure
> layout, but not memory operations. If clang emits "load ... align
1"
> instructions in LLVM IR, the right thing would then happen in the X86
> backend automatically.
>
This sounds like the -fmax-type-align flag:
https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
Explicit alignment attributes are still honored, so some aligned vector
instructions may be generated. However, the documentation describes
essentially this exact use case.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210415/fc06a424/attachment.html>

via llvm-dev

2021-Apr-15 18:54 UTC

head link

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

| This sounds like the -fmax-type-align flag:

Well, no, at least not for the PS4 case.  In our case, the type had an alignment
attribute but the caller didn’t make sure the allocated memory was aligned
properly.  The -fmax-type-align flag explicitly doesn’t do anything in that
case, if I’m reading it correctly.  (Yes, it’s a bug.  Yes, sanitizers or other
testing could have found it.  No, there is no opportunity to do any of the
things that would have fixed it correctly.)

Really what we did was effectively this:  Pretend X86 doesn’t have a VMOVAPS
opcode.  That’s all.  Nothing about memory/operand alignment attributes was
modified, IR is unchanged.  Pretend that one machine opcode is missing.  Can’t
possibly affect anything about IR optimizations, *maybe* something post-ISel
would be different but even that is hard to imagine.  (As best I can remember,
the only test updates we had to make were to change things like “vmovaps” to
“vmov{{u|a}}ps” and done.)  It’s like we did s/movaps/movups/g on the assembly
output.

I still can’t say I think it should be appropriate to do upstream—no real info
yet on Intel’s problem case--but I hope this explains why the bigger hammer
(i.e., get Clang involved) doesn’t seem necessary or appropriate.
--paulr

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of Reid
Kleckner via llvm-dev
Sent: Thursday, April 15, 2021 12:59 PM
To: James Y Knight <jyknight at google.com>
Cc: llvm-dev at lists.llvm.org; Liu, Chen3 <chen3.liu at intel.com>; Luo,
Yuanke <yuanke.luo at intel.com>; Maslov, Sergey V <sergey.v.maslov at
intel.com>
Subject: Re: [llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine
with option control.

On Wed, Apr 14, 2021 at 11:58 AM James Y Knight via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
What I suspect you actually want here is an option to tell Clang not to infer
load/store alignments based on object types or alignment attributes -- instead
treating everything as being potentially aligned to 1 unless the allocation is
seen (e.g. global/local variables). Clang would still need to use the usual
alignment computation for variable definitions and structure layout, but not
memory operations. If clang emits "load ... align 1" instructions in
LLVM IR, the right thing would then happen in the X86 backend automatically.

This sounds like the -fmax-type-align flag:
https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation<https://urldefense.com/v3/__https:/clang.llvm.org/docs/UsersManual.html*controlling-code-generation__;Iw!!JmoZiZGBv3RvKRSx!uoBVF33nyuM5lbseJ-XKanIeYhdhHW9yOoxyF7zJ56FjUs8jsfdUcuw4AQ96FRBrmA$>
Explicit alignment attributes are still honored, so some aligned vector
instructions may be generated. However, the documentation describes essentially
this exact use case.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210415/80875c4c/attachment.html>

James Y Knight via llvm-dev

2021-Apr-16 21:59 UTC

head link

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

On Thu, Apr 15, 2021 at 12:58 PM Reid Kleckner <rnk at google.com> wrote:
> On Wed, Apr 14, 2021 at 11:58 AM James Y Knight via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> What I suspect you *actually* want here is an option to tell Clang not
>> to infer load/store alignments based on object types or alignment
>> attributes -- instead treating everything as being potentially aligned
to 1
>> unless the allocation is seen (e.g. global/local variables). Clang
would
>> still need to use the usual alignment computation for variable
definitions
>> and structure layout, but not memory operations. If clang emits
"load ...
>> align 1" instructions in LLVM IR, the right thing would then
happen in the
>> X86 backend automatically.
>>
>
> This sounds like the -fmax-type-align flag:
> https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
> Explicit alignment attributes are still honored, so some aligned vector
> instructions may be generated. However, the documentation describes
> essentially this exact use case.
>
Wow, thanks! Somehow I've missed that this flag has existed all this time.
ISTM that it would be reasonable to modify -fmax-type-align to override
even an explicit alignment attribute on the type (or typedef).

It looks like -fmax-type-align is barely used in the wild, except that
-fmax-type-align=16 is _default_ for Darwin platforms (since commit
bcd82afad64a22b15000de350d075b10f2de273a
<https://github.com/llvm/llvm-project/commit/bcd82afad64a22b15000de350d075b10f2de273a>).
It's unclear to me what purpose that default is really serving, however,
given that the only types with greater "native" alignment than 16 are
vector types, and typically used vector typedefs already have an alignment
specified, such as `typedef float __m256 __attribute__ ((__vector_size__
(32), __aligned__(32)));`. So the most-commonly-used vector types are
exempted from the effect of the flag, anyways...
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20210416/74c54f7e/attachment.html>

llvm dev - Apr 2021 - [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.

[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.