similar to: Liveness of AL, AH and AX in x86 backend

Displaying results from an estimated 4000 matches similar to: "Liveness of AL, AH and AX in x86 backend"

2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Try using x86 mode rather than Intel64 mode. I have definitely gotten it to use both ah and al in 32-bit x86 code generation. In particular, I have seen that in loops for both the spec2000 and spec2006 versions of bzip. It can happen, but only rarely.

Kevin Smith

>-----Original Message-----
>From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>Krzysztof
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Hi Krzysztof,

> On May 24, 2016, at 8:03 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario.
>
> For example, I wrote this piece:
>
> typedef struct {
>   char x,
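The struct in the quoted mail is cut off after "char x,". A plausible shape for the experiment (two adjacent char fields that the backend could place in AL and AH, covered by a single AX store) might look like the C sketch below; the field names and the helper set_pair are illustrative guesses, not the original code.

    #include <stdint.h>

    /* Hypothetical reconstruction -- the original struct is truncated above. */
    typedef struct {
        char x, y;              /* two adjacent bytes, candidates for AL/AH */
    } pair_t;

    /* Store two separately computed bytes; ideally the backend would form
     * AL and AH and then emit one 16-bit (AX) store instead of two byte
     * stores. */
    void set_pair(pair_t *p, char a, char b)
    {
        p->x = a;
        p->y = b;
    }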
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)... Here's another question that falls into the same category: The function X86InstrInfo::loadRegFromStackSlot does not append any implicit uses/defs. How does it
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
On several variants of x86 processors, mixing `ah`, `al` and `ax` as source/destination in the same dependency chain incurs penalties, so for THOSE processors there is a benefit to NOT using `al` and `ah` to reflect parts of `ax`. I believe this is caused by the fact that the processor doesn't ACTUALLY see these as parts of a bigger register internally, and will execute two independent
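As a concrete illustration of the pattern being discussed, the sketch below (not code from the thread; GCC/Clang extended inline asm, for x86 targets only) writes AL and AH independently and then reads AX as a whole, which is exactly the case where a partial-register merge can be triggered.

    #include <stdint.h>

    /* Hypothetical sketch: define the AL and AH sub-registers separately,
     * then use the AX super-register, forcing a merge of the two partial
     * results on microarchitectures that rename the 8-bit halves
     * independently. */
    static inline uint16_t merge_al_ah(uint8_t lo, uint8_t hi)
    {
        uint32_t eax_out;
    #if defined(__i386__) || defined(__x86_64__)
        __asm__("movb %1, %%al\n\t"     /* AL<def>              */
                "movb %2, %%ah\n\t"     /* AH<def>              */
                "movzwl %%ax, %0"       /* use of AX as a whole */
                : "=r"(eax_out)
                : "m"(lo), "m"(hi)
                : "eax");
    #else
        eax_out = ((uint32_t)hi << 8) | lo;   /* portable fallback */
    #endif
        return (uint16_t)eax_out;
    }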
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Hi,

Could you use "MIR" to forge the example you're looking for?

-- Mehdi

> On May 24, 2016, at 10:10 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same.
>
> As per the subject, I'm not really interested in the
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same. As per the subject, I'm not really interested in the quality of the final code, but in the way that the x86 target deals with the structural relationship between these registers. Specifically, I'd like to see if it would generate implicit defs/uses for AX on defs/uses of
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Here's some of the generated code from the current community head for bzip2.c from spec 256.bzip2, with these options: clang -m32 -S -O2 bzip2.c

.LBB14_4:                        # %bsW.exit24
        subl    %eax, %ebx
        addl    $8, %eax
        movl    %ebx, %ecx
        movl    %eax, bsLive
        shll    %cl, %edi
        movl    %ebp, %ecx
        orl     %esi, %edi
2016 May 24
1
Liveness of AL, AH and AX in x86 backend
Thanks Kevin. This isn't exactly what I'm looking for, though. The ECX is explicitly defined here and CL/CH are only used. I was interested in the opposite situation---where the sub-registers are defined separately and then the super-register is used as a whole. Hopefully the sub-register liveness tracking is what I need, so the questions about x86 may become moot. -Krzysztof
2016 May 25
0
Liveness of AL, AH and AX in x86 backend
> On May 24, 2016, at 11:01 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
>
> Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)...
>
> Here's another question that falls into the
2012 Jul 31
0
[LLVMdev] [llvm-commits] rotate
On Tue, Jul 31, 2012 at 8:42 AM, Cameron McInally <cameron.mcinally at nyu.edu> wrote:
> Andy,
>
> Here is the left circular shift operator patch. I apologize to the reviewer
> in advance. The patch has a good bit of fine detail. Any
> comments/criticisms?
>
> Some caveats...
>
> 1) This is just the bare minimum needed to make the left circular shift
> operator
2012 Jul 29
3
[LLVMdev] rotate
Nice! Clever compiler.

On 07/28/2012 08:55 PM, Michael Gottesman wrote:
> I can get clang/llvm to emit a rotate instruction on x86-64 when compiling C by just using -Os and the rotate from Hacker's Delight, i.e.,
>
> ======
> #include <stdlib.h>
> #include <stdint.h>
>
> uint32_t ror(uint32_t input, size_t rot_bits)
> {
>     return (input>>
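The quoted function is cut off mid-expression. A common way to finish the Hacker's Delight style rotate, written so a shift amount of 0 stays well defined, is sketched below; this is a reconstruction of the usual idiom, not necessarily the exact body from the original mail. With clang at -Os (or -O2) this pattern is typically recognized and lowered to a single ror instruction on x86-64.

    #include <stdint.h>
    #include <stdlib.h>

    /* Reconstruction of the usual rotate-right idiom; the original body is
     * truncated above.  Masking with 31 keeps both shift counts in range,
     * so rot_bits == 0 does not invoke undefined behavior. */
    uint32_t ror(uint32_t input, size_t rot_bits)
    {
        rot_bits &= 31;
        return (input >> rot_bits) | (input << ((32 - rot_bits) & 31));
    }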
2012 Jul 31
3
[LLVMdev] rotate
Andy,

Here is the left circular shift operator patch. I apologize to the reviewer in advance. The patch has a good bit of fine detail. Any comments/criticisms?

Some caveats...

1) This is just the bare minimum needed to make the left circular shift operator work (e.g. no instruction combining).
2) I tried my best to select operator names in the existing style; please feel free to change them as
2013 Aug 09
1
[PATCH V13 00/14] Paravirtualized ticket spinlocks
From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>

This series replaces the existing paravirtualized spinlock mechanism with a paravirtualized ticketlock mechanism. The series provides an implementation for both Xen and KVM. The current set of patches is for the Xen/x86 spinlock/KVM guest side, to be included against -tip. A separate patchset for the KVM host based on the kvm tree is already
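For readers unfamiliar with the mechanism being extended here, a plain (non-paravirtualized) ticket lock can be sketched in a few lines of C11; the structure and names below are illustrative only and are not the kernel's implementation. The paravirtualized variant in this series replaces the unbounded busy-wait with a halt/kick protocol so an overcommitted guest vCPU yields instead of spinning.

    #include <stdatomic.h>

    /* Minimal sketch of a plain ticket lock (illustrative, not kernel code).
     * Each locker takes a ticket; the lock is FIFO by ticket number. */
    struct ticket_lock {
        atomic_uint next;    /* next ticket to hand out                 */
        atomic_uint owner;   /* ticket whose holder may enter the lock  */
    };

    static void ticket_lock_acquire(struct ticket_lock *l)
    {
        unsigned me = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);
        while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
            ;  /* the paravirtualized version would halt/yield here after
                  spinning past a threshold, instead of burning CPU */
    }

    static void ticket_lock_release(struct ticket_lock *l)
    {
        atomic_fetch_add_explicit(&l->owner, 1, memory_order_release);
    }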
2011 Sep 06
4
[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory
I'm seeing some behavior that surprised me when writing an <8 x i1> vector to memory and reading it back. (Specifically, the surprise is that I didn't get the original value back!) This happens both with TOT and 2.9. This program illustrates the issue:

define i32 @foo() {
  %c = alloca <8 x i1>
  store <8 x i1> <i1 true, i1 false, i1 false, i1 false, i1 false, i1
2015 Dec 11
2
Optimization of successive constant stores
Hmm... found an interesting issue: Given:

  %2 = getelementptr inbounds %UodStructType* %0, i32 0, i32 0
  store i8 1, i8* %2, align 8
  %3 = getelementptr inbounds %UodStructType* %0, i32 0, i32 1
  store i8 2, i8* %3, align 1
  %4 = getelementptr inbounds %UodStructType* %0, i32 0, i32 2
  store i8 3, i8* %4, align 2
  %5 = getelementptr inbounds %UodStructType* %0, i32 0, i32 3
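A C source producing roughly the IR above could look like the sketch below. The real %UodStructType layout and the truncated fourth store are not shown in the thread, so the four byte fields and the constant 4 are assumptions for illustration only; the point of the thread is that four adjacent byte stores of known constants could in principle be merged into a single aligned 32-bit store.

    #include <stdint.h>

    /* Hypothetical source for the IR pattern above.  The real struct layout
     * and the truncated fourth constant are unknown; four consecutive byte
     * fields and the value 4 are assumed here. */
    struct UodStructType {
        uint8_t a, b, c, d;
    };

    void init(struct UodStructType *p)
    {
        /* Four adjacent byte stores of constants; with suitable alignment a
         * backend could merge them into one 32-bit store (0x04030201 on a
         * little-endian target, under the assumptions above). */
        p->a = 1;
        p->b = 2;
        p->c = 3;
        p->d = 4;
    }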
2016 May 25
1
Liveness of AL, AH and AX in x86 backend
On 5/25/2016 12:35 PM, Quentin Colombet wrote:
>
> Doing that would say that we override the other lanes of EAX, which is
> not what we want. In what cases do we need to add those implicit arguments?

If you had

  AL<def> = ...
  AH<def> = ...
  ... = AX

you'd need implicit uses/defs to define AX. This sort of thing happens on Hexagon very often: general purpose
2013 Jun 01
11
[PATCH RFC V9 0/19] Paravirtualized ticket spinlocks
This series replaces the existing paravirtualized spinlock mechanism with a paravirtualized ticketlock mechanism. The series provides implementation for both Xen and KVM.

Changes in V9:
- Changed spin_threshold to 32k to avoid excess halt exits that are causing undercommit degradation (after PLE handler improvement).
- Added kvm_irq_delivery_to_apic (suggested by Gleb)
- Optimized halt exit