Krzysztof Parzyszek via llvm-dev
2016-May-24 18:01 UTC
[llvm-dev] Liveness of AL, AH and AX in x86 backend
Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)...

Here's another question that falls into the same category:

The function X86InstrInfo::loadRegFromStackSlot does not append any implicit uses/defs. How does it know that it won't need them? If AX was spilled in the middle of a live range of EAX, wouldn't restoring of AX need to implicitly define EAX?

We deal with such cases a lot in the Hexagon backend and it continues to be a major pain. I'm trying to understand if there are better options for us.

-Krzysztof

On 5/24/2016 12:40 PM, Quentin Colombet wrote:
> Hi Krzysztof,
>
>> On May 24, 2016, at 8:03 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario.
>>
>> For example, I wrote this piece:
>>
>> typedef struct {
>>   char x, y;
>> } struct_t;
>>
>> struct_t z;
>>
>> struct_t foo(char *p) {
>>   struct_t s;
>>   s.x = *p++;
>>   s.y = *p;
>>   z = s;
>>   s.x++;
>>   return s;
>> }
>>
>> But the output at -O2 is
>>
>> foo:                                    # @foo
>>         .cfi_startproc
>> # BB#0:                                 # %entry
>>         movb    (%rdi), %al
>>         movzbl  1(%rdi), %ecx
>>         movb    %al, z(%rip)
>>         movb    %cl, z+1(%rip)
>>         incb    %al
>>         shll    $8, %ecx
>>         movzbl  %al, %eax
>>         orl     %ecx, %eax
>>         retq
>>
>>
>> I was hoping it would do something along the lines of
>>
>>         movb    (%rdi), %al
>>         movb    1(%rdi), %ah
>>         movh    %ax, z(%rip)
>>         incb    %al
>>         retq
>>
>>
>> Why is the x86 backend not getting this code?
>
> Try enabling the sub-register liveness feature. I am guessing we think we cannot use the same register for the low and high part.
> Though, I would need to see the machine instrs to be sure.
>
>> Does it know that AH:AL = AX?
>
> Yes it does.
>
> Cheers,
> -Quentin
>>
>> -Krzysztof
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
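
For readers following the AH:AL = AX question, here is a minimal sketch of the aliasing in plain C. It is not LLVM code: reg_model, write_al, write_ah, read_ax, store16 and word_store_matches_byte_stores are hypothetical names made up for illustration. It only shows that, once AL and AH hold the two struct bytes, one little-endian 16-bit store of AX writes the same two bytes as two separate byte stores, which is why a single store of AX (or CX) would be correct in the example above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of EAX (illustration only, not an LLVM type):
 * AL is bits 0-7, AH is bits 8-15, AX is bits 0-15. */
typedef struct { uint32_t eax; } reg_model;

/* Writing AL replaces bits 0-7 and leaves every other bit alone. */
static void write_al(reg_model *r, uint8_t v) {
    r->eax = (r->eax & ~(uint32_t)0xFF) | v;
}

/* Writing AH replaces bits 8-15 and leaves every other bit alone. */
static void write_ah(reg_model *r, uint8_t v) {
    r->eax = (r->eax & ~(uint32_t)0xFF00) | ((uint32_t)v << 8);
}

/* AX is simply the low 16 bits, i.e. AH:AL. */
static uint16_t read_ax(const reg_model *r) {
    return (uint16_t)(r->eax & 0xFFFF);
}

/* Little-endian 16-bit store, as x86 would perform it. */
static void store16(uint8_t *mem, uint16_t v) {
    assert(mem != 0);
    mem[0] = (uint8_t)(v & 0xFF);
    mem[1] = (uint8_t)(v >> 8);
}

/* Returns 1 when a single store of AX produces the same two bytes as
 * storing AL to offset 0 and AH to offset 1. */
static int word_store_matches_byte_stores(uint8_t x, uint8_t y) {
    reg_model r = { 0xDEAD0000u };   /* arbitrary upper bits */
    uint8_t by_bytes[2] = { x, y };  /* two separate byte stores */
    uint8_t by_word[2];
    write_al(&r, x);
    write_ah(&r, y);
    store16(by_word, read_ax(&r));   /* one 16-bit store */
    return by_bytes[0] == by_word[0] && by_bytes[1] == by_word[1];
}
```

Calling word_store_matches_byte_stores with any pair of byte values returns 1 in this model; the upper 16 bits of the modelled EAX are never touched by the byte writes.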
Smith, Kevin B via llvm-dev
2016-May-24 18:58 UTC
[llvm-dev] Liveness of AL, AH and AX in x86 backend
I'll try to find the example code from bzip. I ran across this exact situation when doing work on X86FixupBWInsts.cpp.

It can happen, but the register allocator doesn't seem to want to do it until it has run out of the low-order byte registers. I suspect that this is simply due to the register allocation order preferences, and yes, the upper byte registers definitely were to be avoided in past x86 architectures, and still are, although to a lesser extent, in current-generation x86 architectures.

Kevin Smith
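
To illustrate the allocation-order effect described here, a toy sketch in C follows. It is only an assumption about the general mechanism: the order in kOrder is invented for illustration and is not the real x86 allocation order, and pick_register and register_name are hypothetical helpers, not LLVM functions. A first-free scan over a preference list only reaches the high-byte registers once every low-byte register is taken.

```c
#include <assert.h>
#include <stddef.h>

/* Invented preference order for this sketch: low-byte registers first,
 * high-byte registers last. The real order may differ. */
static const char *const kOrder[] = {
    "AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"
};
enum { kNumRegs = sizeof(kOrder) / sizeof(kOrder[0]) };

/* in_use[i] != 0 means kOrder[i] is already assigned.
 * Returns the index of the first free register, or -1 if none is free. */
static int pick_register(const int in_use[], int n) {
    assert(n >= 0 && n <= kNumRegs);
    for (int i = 0; i < n; ++i)
        if (!in_use[i])
            return i;
    return -1;
}

/* Maps an index back to a register name, or NULL when out of range. */
static const char *register_name(int idx) {
    return (idx >= 0 && idx < kNumRegs) ? kOrder[idx] : NULL;
}
```

With nothing in use the scan picks index 0 (AL); only when the first four entries are marked in use does it return index 4 (AH), which matches the observation that high-byte registers show up only under pressure.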
Quentin Colombet via llvm-dev
2016-May-25 17:35 UTC
[llvm-dev] Liveness of AL, AH and AX in x86 backend
> On May 24, 2016, at 11:01 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
>
> Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)...
>
> Here's another question that falls into the same category:
>
> The function X86InstrInfo::loadRegFromStackSlot does not append any implicit uses/defs. How does it know that it won't need them? If AX was spilled in the middle of a live range of EAX, wouldn't restoring of AX need to implicitly define EAX?

Doing that would say that we override the other lanes of EAX, which is not what we want. In what cases, do we need to add those implicit arguments?

Also, IIRC, right now, even with sub register liveness enabled, we would spill the whole register. The subreg liveness only gives us more precise interference, but does not affect splitting or spilling.

> We deal with such cases a lot in the Hexagon backend and it continues to be a major pain. I'm trying to understand if there are better options for us.
>
> -Krzysztof
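
The two points in this reply can be sketched in C under some loud assumptions. The lane numbering (LANE_AL, LANE_AH, LANE_HI16) is hypothetical and does not reflect LLVM's actual lane masks, and lanes_interfere and reload_ax are made-up helpers. The sketch shows (1) that interference computed per lane lets values in AL and AH coexist, and (2) that reloading AX has to leave the upper lanes of EAX intact, which is why marking the reload as a full definition of EAX would say the wrong thing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lane numbering for the pieces of EAX (illustration only). */
enum {
    LANE_AL   = 1 << 0,              /* bits 0-7   */
    LANE_AH   = 1 << 1,              /* bits 8-15  */
    LANE_HI16 = 1 << 2,              /* bits 16-31 */
    LANES_AX  = LANE_AL | LANE_AH,
    LANES_EAX = LANES_AX | LANE_HI16
};

/* Two simultaneously live values placed in the same physical register
 * interfere only if they occupy at least one common lane. */
static int lanes_interfere(unsigned a, unsigned b) {
    assert((a & ~(unsigned)LANES_EAX) == 0 && (b & ~(unsigned)LANES_EAX) == 0);
    return (a & b) != 0;
}

/* Reloading a spilled AX rewrites only the AX lanes; the upper 16 bits of
 * EAX keep whatever value they held before the reload. */
static uint32_t reload_ax(uint32_t eax, uint16_t spilled_ax) {
    return (eax & 0xFFFF0000u) | spilled_ax;
}
```

In this model a value in AL and a value in AH do not interfere, while AX and AH do; and reload_ax keeps the upper half of EAX, so treating it as a definition of all of EAX would wrongly end the live range of those upper lanes.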
Krzysztof Parzyszek via llvm-dev
2016-May-25 18:44 UTC
[llvm-dev] Liveness of AL, AH and AX in x86 backend
On 5/25/2016 12:35 PM, Quentin Colombet wrote:
>
> Doing that would say that we override the other lanes of EAX, which is
> not what we want. In what cases, do we need to add those implicit arguments?

If you had

  AL<def> = ...
  AH<def> = ...
  ... = AX

you'd need implicit uses/defs to define AX. This sort of thing happens on Hexagon very often: general purpose registers can be paired into 64-bit registers (and used as a whole in 64-bit instructions) and it is not uncommon that the elements of the pair will be defined individually.

In the above case you'd need something like

  AL<def> = ..., AX<imp-def>
  AH<def> = ..., AX<imp-def>, AX<imp-use>
  ... = AX

I was trying to replicate a similar situation in the X86 backend to see what it would do.

However, this is not needed anymore, because subregister liveness tracking looks very promising in eliminating this problem altogether. Now, if only the anti-dep problem was fixed, things would look peachy... :)

-Krzysztof
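
The reason per-lane tracking removes the need for those implicit operands can be sketched as a simple coverage check. This is an illustrative model, not how LLVM implements liveness: LANE_LO, LANE_HI, LANES_PAIR and use_is_covered are hypothetical names. A use of the whole register (AX, or a Hexagon register pair) is fully defined once the union of the lanes written by earlier instructions covers the lanes it reads.

```c
#include <assert.h>

/* Hypothetical lanes of a two-part register (AL/AH, or the two halves of
 * a register pair). */
enum {
    LANE_LO    = 1 << 0,
    LANE_HI    = 1 << 1,
    LANES_PAIR = LANE_LO | LANE_HI
};

/* defs[i] is the lane mask written by the i-th earlier instruction.
 * Returns 1 if every lane in use_lanes has been written by some def. */
static int use_is_covered(const unsigned defs[], int ndefs, unsigned use_lanes) {
    unsigned defined = 0;
    assert(ndefs >= 0);
    for (int i = 0; i < ndefs; ++i)
        defined |= defs[i];          /* accumulate the lanes defined so far */
    return (use_lanes & ~defined) == 0;
}
```

With separate definitions of the low and high lanes, a read of the whole pair is covered without any implicit def of the pair; with only the low lane defined, it is not.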