thr3ads.net - similar to: "Liveness of AL, AH and AX in x86 backend"

Liveness of AL, AH and AX in x86 backend

2016 May 24

0

Liveness of AL, AH and AX in x86 backend

Try using x86 mode rather than Intel64 mode. I have definitely gotten it to use both ah and al in 32-bit x86 code generation. In particular, I have seen that in loops for both the spec2000 and spec2006 versions of bzip. It can happen, but only rarely.

Kevin Smith

>-----Original Message-----
>From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>Krzysztof

Liveness of AL, AH and AX in x86 backend

2016 May 24

0

Liveness of AL, AH and AX in x86 backend

Hi Krzysztof,

> On May 24, 2016, at 8:03 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario.
>
> For example, I wrote this piece:
>
> typedef struct {
>   char x,

Liveness of AL, AH and AX in x86 backend

2016 May 24

3

Liveness of AL, AH and AX in x86 backend

Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)... Here's another question that falls into the same category: The function X86InstrInfo::loadRegFromStackSlot does not append any implicit uses/defs. How does it

Liveness of AL, AH and AX in x86 backend

2016 May 24

3

Liveness of AL, AH and AX in x86 backend

On several variants of x86 processors, mixing `ah`, `al` and `ax` as source/destination in the same dependency chain incurs penalties, so for THOSE processors, there is a benefit to NOT using `al` and `ah` to reflect parts of `ax` - I believe this is caused by the fact that the processor doesn't ACTUALLY see these as parts of a bigger register internally, and will execute two independent

Liveness of AL, AH and AX in x86 backend

2016 May 24

3

Liveness of AL, AH and AX in x86 backend

Hi,

Could you use "MIR" to forge the example you're looking for?

--
Mehdi

> On May 24, 2016, at 10:10 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same.
>
> As per the subject, I'm not really interested in the

Liveness of AL, AH and AX in x86 backend

2016 May 24

0

Liveness of AL, AH and AX in x86 backend

Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same. As per the subject, I'm not really interested in the quality of the final code, but in the way that the x86 target deals with the structural relationship between these registers. Specifically, I'd like to see if it would generate implicit defs/uses for AX on defs/uses of

Liveness of AL, AH and AX in x86 backend

2016 May 24

0

Liveness of AL, AH and AX in x86 backend

Here's some of the generated code from the current community head for bzip2.c from spec 256.bzip2, with these options:

    clang -m32 -S -O2 bzip2.c

    .LBB14_4:                # %bsW.exit24
        subl %eax, %ebx
        addl $8, %eax
        movl %ebx, %ecx
        movl %eax, bsLive
        shll %cl, %edi
        movl %ebp, %ecx
        orl %esi, %edi

Liveness of AL, AH and AX in x86 backend

2016 May 24

1

Liveness of AL, AH and AX in x86 backend

Thanks Kevin. This isn't exactly what I'm looking for, though. The ECX is explicitly defined here and CL/CH are only used. I was interested in the opposite situation---where the sub-registers are defined separately and then the super-register is used as a whole. Hopefully the sub-register liveness tracking is what I need, so the questions about x86 may become moot. -Krzysztof
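
The scenario described above (two sub-registers defined separately, then the containing register read as a whole) can be modeled in plain C. This is only an illustrative sketch of the lane relationship, not how the LLVM x86 backend represents registers; the names `reg16`, `write_low`, `write_high` and `read_full` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of a 16-bit register (think AX) whose two
   8-bit halves (think AL and AH) can be written independently. */
typedef struct {
    uint16_t value;
} reg16;

/* Define only the low lane; the high lane is preserved. */
static void write_low(reg16 *r, uint8_t lo) {
    r->value = (uint16_t)((r->value & 0xFF00u) | lo);
}

/* Define only the high lane; the low lane is preserved. */
static void write_high(reg16 *r, uint8_t hi) {
    r->value = (uint16_t)((r->value & 0x00FFu) | ((uint16_t)hi << 8));
}

/* Use the full register: the result depends on BOTH earlier lane writes. */
static uint16_t read_full(const reg16 *r) {
    return r->value;
}
```

A read through `read_full` after `write_low` and `write_high` observes both partial definitions, which is why a liveness analysis has to treat the full register as live across the two sub-register definitions.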

Liveness of AL, AH and AX in x86 backend

2016 May 25

0

Liveness of AL, AH and AX in x86 backend

> On May 24, 2016, at 11:01 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
>
> Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)...
>
> Here's another question that falls into the

[LLVMdev] [llvm-commits] rotate

2012 Jul 31

0

[LLVMdev] [llvm-commits] rotate

On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> Andy,
>
> Here is the left circular shift operator patch. I apologize to the reviewer in advance. The patch has a good bit of fine detail. Any comments/criticisms?
>
> Some caveats...
>
> 1) This is just the bare minimum needed to make the left circular shift operator

[LLVMdev] rotate

2012 Jul 29

3

[LLVMdev] rotate

Nice! Clever compiler..

On 07/28/2012 08:55 PM, Michael Gottesman wrote:
> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight i.e.,
>
> ======
> #include<stdlib.h>
> #include<stdint.h>
>
> uint32_t ror(uint32_t input, size_t rot_bits)
> {
>   return (input>>
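
The quoted function is cut off above, so the following is not a reconstruction of it. It is a separate minimal sketch of a common shift-and-or rotate-right idiom in C, with the shift counts masked so that no shift by 32 occurs. The name `ror32` is hypothetical, and whether a given compiler turns this into a single rotate instruction depends on the compiler version and options.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bit positions.
   Both shift counts are masked to 0..31, so n == 0 and n >= 32
   never produce an out-of-range shift. */
static uint32_t ror32(uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

For example, `ror32(0x12345678u, 8)` moves the low byte `0x78` to the top of the word.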

[LLVMdev] rotate

2012 Jul 31

3

[LLVMdev] rotate

Andy,

Here is the left circular shift operator patch. I apologize to the reviewer in advance. The patch has a good bit of fine detail. Any comments/criticisms?

Some caveats...

1) This is just the bare minimum needed to make the left circular shift operator work (e.g. no instruction combining).
2) I tried my best to select operator names in the existing style; please feel free to change them as

[PATCH V13 00/14] Paravirtualized ticket spinlocks

2013 Aug 09

1

[PATCH V13 00/14] Paravirtualized ticket spinlocks

From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> This series replaces the existing paravirtualized spinlock mechanism with a paravirtualized ticketlock mechanism. The series provides implementation for both Xen and KVM. The current set of patches are for Xen/x86 spinlock/KVM guest side, to be included against -tip. A separate patchset for KVM host based on kvm tree is already

[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory

2011 Sep 06

4

[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory

I'm seeing some behavior that surprised me in writing an <8 x i1> vector to memory and reading it back. (Specifically, the surprise is that I didn't get the original value back!). This happens both with TOT and 2.9. This program illustrates the issue:

    define i32 @foo() {
      %c = alloca <8 x i1>
      store <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 false, i1

Optimization of successive constant stores

2015 Dec 11

2

Optimization of successive constant stores

Hmm... found an interesting issue: Given:

    %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0
    store i8 1, i8* %2, align 8
    %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
    store i8 2, i8* %3, align 1
    %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2
    store i8 3, i8* %4, align 2
    %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3

Liveness of AL, AH and AX in x86 backend

2016 May 25

1

Liveness of AL, AH and AX in x86 backend

On 5/25/2016 12:35 PM, Quentin Colombet wrote:
>
> Doing that would say that we override the other lanes of EAX, which is not what we want. In what cases do we need to add those implicit arguments?

If you had

    AL<def> = ...
    AH<def> = ...
    ... = AX

you'd need implicit uses/defs to define AX. This sort of thing happens on Hexagon very often: general purpose

[PATCH RFC V9 0/19] Paravirtualized ticket spinlocks

2013 Jun 01

11

[PATCH RFC V9 0/19] Paravirtualized ticket spinlocks

This series replaces the existing paravirtualized spinlock mechanism with a paravirtualized ticketlock mechanism. The series provides implementation for both Xen and KVM. Changes in V9: - Changed spin_threshold to 32k to avoid excess halt exits that are causing undercommit degradation (after PLE handler improvement). - Added kvm_irq_delivery_to_apic (suggested by Gleb) - Optimized halt exit
