Craig Topper via llvm-dev
2021-Apr-15 20:50 UTC
[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.
What if we didn't use aligned instructions by default, like what PS4 did,
and then had a command-line option that would "enable alignment exceptions"
if someone wants them? Maybe that option should also disable memory folding,
since memory folding never checks alignment with AVX?

Do other targets that have vectors have alignment exceptions like this?

We're not obligated to emit code that detects alignment errors, and we
already don't if the load gets folded.

It seems the problem with the current proposal is that once you have the
exception, setting a flag to make it go away is the wrong response.

~Craig

On Thu, Apr 15, 2021 at 1:10 PM Reid Kleckner via llvm-dev
<llvm-dev at lists.llvm.org> wrote:

> Right, I get that this doesn't match what you are doing for PS4, and it
> doesn't match what Chen3 Liu proposed. To James's point, the
> -fmax-type-align flag is more principled because it powers down all the
> other LLVM optimizations that assume aligned pointers have zeros in the low
> bits.
>
> As for how to handle explicit alignment attributes that don't come from
> type information, maybe we could revisit that behavior, or conditionalize
> it with a flag. I just mean to say that there is prior art for this
> direction. We should continue in the direction of a complete solution from
> the frontend, rather than adding a workaround in the backend.
>
> On Thu, Apr 15, 2021 at 11:54 AM <paul.robinson at sony.com> wrote:
>
>> | This sounds like the -fmax-type-align flag:
>>
>> Well, no, at least not for the PS4 case. In our case, the type had an
>> alignment attribute but the caller didn’t make sure the allocated memory
>> was aligned properly. The -fmax-type-align flag explicitly doesn’t do
>> anything in that case, if I’m reading it correctly. (Yes, it’s a bug.
>> Yes, sanitizers or other testing could have found it. No, there is no
>> opportunity to do any of the things that would have fixed it correctly.)
>>
>> Really what we did was effectively this: Pretend X86 doesn’t have a
>> VMOVAPS opcode. That’s all. Nothing about memory/operand alignment
>> attributes was modified, IR is unchanged. Pretend that one machine opcode
>> is missing. Can’t possibly affect anything about IR optimizations,
>> *maybe* something post-ISel would be different but even that is hard to
>> imagine. (As best I can remember, the only test updates we had to make
>> were to change things like “vmovaps” to “vmov{{u|a}}ps” and done.) It’s
>> like we did s/movaps/movups/g on the assembly output.
>>
>> I still can’t say I think it should be appropriate to do upstream -- no
>> real info yet on Intel’s problem case -- but I hope this explains why the
>> bigger hammer (i.e., get Clang involved) doesn’t seem necessary or
>> appropriate.
>>
>> --paulr
>>
>> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of* Reid
>> Kleckner via llvm-dev
>> *Sent:* Thursday, April 15, 2021 12:59 PM
>> *To:* James Y Knight <jyknight at google.com>
>> *Cc:* llvm-dev at lists.llvm.org; Liu, Chen3 <chen3.liu at intel.com>; Luo,
>> Yuanke <yuanke.luo at intel.com>; Maslov, Sergey V
>> <sergey.v.maslov at intel.com>
>> *Subject:* Re: [llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx
>> machine with option control.
>>
>> On Wed, Apr 14, 2021 at 11:58 AM James Y Knight via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>
>> What I suspect you *actually* want here is an option to tell Clang not
>> to infer load/store alignments based on object types or alignment
>> attributes -- instead treating everything as being potentially aligned to 1
>> unless the allocation is seen (e.g. global/local variables). Clang would
>> still need to use the usual alignment computation for variable definitions
>> and structure layout, but not memory operations. If clang emits "load ...
>> align 1" instructions in LLVM IR, the right thing would then happen in the
>> X86 backend automatically.
>>
>> This sounds like the -fmax-type-align flag:
>>
>> https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
>>
>> Explicit alignment attributes are still honored, so some aligned vector
>> instructions may be generated. However, the documentation describes
>> essentially this exact use case.
>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
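
Craig's suggestion at the top of this message can be pictured as a small
opcode-selection policy. The Python below is only a toy model under stated
assumptions, not LLVM code: `VectorLoad`, `select_move`,
`can_fault_on_misalignment`, and the `alignment_exceptions` switch are all
hypothetical names, and the rule "use the aligned form only when the recorded
alignment is at least the vector width" is an assumption of the sketch. It
also follows Craig's tentative idea that turning the exceptions on would
disable memory folding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorLoad:
    """A vector load as a backend might see it (hypothetical model)."""
    vector_bytes: int       # 16 for an XMM load, 32 for a YMM load
    align: int              # alignment recorded on the IR load, in bytes
    foldable: bool = False  # could be folded into a consumer's memory operand


def select_move(load: VectorLoad, alignment_exceptions: bool = False) -> str:
    """Return the mnemonic this toy policy would pick for the load."""
    if load.foldable and not alignment_exceptions:
        # No standalone move is emitted; the consumer reads memory directly,
        # and in this model that access is never alignment-checked.
        return "(folded)"
    if alignment_exceptions and load.align >= load.vector_bytes:
        # Opt-in: the user asked for misaligned accesses to trap.
        return "vmovaps"
    # Default in this sketch: behave as if the aligned opcode did not exist.
    return "vmovups"


def can_fault_on_misalignment(load: VectorLoad,
                              alignment_exceptions: bool = False) -> bool:
    """In this model only an explicit aligned move can trap."""
    return select_move(load, alignment_exceptions) == "vmovaps"
```

With the default setting, `select_move(VectorLoad(32, 32))` picks the
unaligned form even though the IR claims 32-byte alignment; only the opt-in
switch brings the trapping form back.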
Paul Robinson via llvm-dev
2021-Apr-15 22:03 UTC
[llvm-dev] [RFC] [X86] Emit unaligned vector moves on avx machine with option control.
Reid,

I’m not clear why anyone would want to “power down” the alignment-aware
optimizations? How does that benefit anyone? For example…

Let’s postulate a target that has only non-trapping load/store
instructions; maybe they go faster on aligned addresses, but don’t trap on
unaligned addresses. It has been a few decades but I think VAX worked this
way. Would you insist we should power-down the alignment-aware
optimizations for this target? Just because the hardware couldn’t require
aligned data? I hope not.

The conclusion must be, then, that there is no relationship between the
existence of trapping/non-trapping instruction behavior for a given target,
and how the frontend and middle-end should behave. Therefore, we can’t
insist on the front-end slapping “align 1” on everything just because the
target doesn’t trap a non-aligned load.

Therefore, the choice of trapping/non-trapping instruction behavior in the
X86 target specifically, has no necessary relationship to how alignment is
thought of in the front-end/middle-end.

HTH,
--paulr
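
The separation Paul describes can be sketched in a few lines of Python. This
is a toy model, not LLVM code: `known_zero_low_bits`, `fold_low_bits_mask`,
and `emit_load` are hypothetical helpers, and the opcode rule in `emit_load`
is an assumption of the sketch. The point it illustrates is that the
middle-end fold depends only on the alignment recorded in the IR, while the
trapping or non-trapping opcode is chosen later from that same, unchanged
alignment.

```python
from typing import Optional


def known_zero_low_bits(align: int) -> int:
    """Number of low address bits a middle-end may assume are zero."""
    if align <= 0 or align & (align - 1):
        raise ValueError("alignment must be a positive power of two")
    return align.bit_length() - 1


def fold_low_bits_mask(align: int, mask: int) -> Optional[int]:
    """Fold `ptr & mask` to 0 if every masked bit is known zero, else None."""
    known_zero = align - 1
    return 0 if mask & ~known_zero == 0 else None


def emit_load(align: int, vector_bytes: int, use_trapping_move: bool) -> str:
    """Backend choice: reads the IR alignment but never rewrites it."""
    if use_trapping_move and align >= vector_bytes:
        return "vmovaps"
    return "vmovups"


# Same IR fact (align 32) under both backend policies: the fold is identical,
# only the mnemonic differs.
for use_trapping_move in (True, False):
    folded = fold_low_bits_mask(32, 31)
    opcode = emit_load(32, 32, use_trapping_move)
    print(use_trapping_move, folded, opcode)

# Forcing "align 1" in the frontend is what loses the fold.
print(fold_low_bits_mask(1, 31))
```

In this model, switching `use_trapping_move` never changes what
`fold_low_bits_mask` returns; only lowering the recorded alignment does.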