similar to: [LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory

Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory"

2011 Sep 06
0
[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory
On Tue, Sep 6, 2011 at 4:37 PM, Matt Pharr <matt.pharr at gmail.com> wrote:
> I'm seeing some behavior that surprised me in writing an <8 x i1> vector to memory and reading it back. (Specifically, the surprise is that I didn't get the original value back!). This happens both with TOT and 2.9. This program illustrates the issue:
>
> define i32 @foo() {
>  %c =
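For context, here is a minimal sketch of the kind of test the thread describes; it is not the truncated program above, and the function name is made up. It stores an <8 x i1> to a stack slot and reads the same memory back as a plain i8; whether the loaded byte comes back as a packed bit-mask depends on how the target legalizes <8 x i1> in memory, which is exactly the behavior being questioned. Typed-pointer IR syntax is used.

define i8 @store_reload_mask() {
  %slot = alloca <8 x i1>
  ; write an alternating true/false lane pattern
  store <8 x i1> <i1 1, i1 0, i1 1, i1 0, i1 1, i1 0, i1 1, i1 0>, <8 x i1>* %slot
  ; reinterpret the slot as a byte and read it back
  %p = bitcast <8 x i1>* %slot to i8*
  %v = load i8, i8* %p
  ret i8 %v
}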
2016 May 24
5
Liveness of AL, AH and AX in x86 backend
I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario. For example, I wrote this piece:

typedef struct {
  char x, y;
} struct_t;

struct_t z;

struct_t foo(char *p) {
  struct_t s;
  s.x = *p++;
  s.y = *p;
  z = s;
  s.x++;
  return s;
}

But the output at -O2
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Try using x86 mode rather than Intel64 mode. I have definitely gotten it to use both ah and al in 32-bit x86 code generation. In particular, I have seen that in loops for both the spec2000 and spec2006 versions of bzip. It can happen, but it does so only rarely.

Kevin Smith

>-----Original Message-----
>From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>Krzysztof
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Enabling subreg liveness tracking didn't do anything. By altering the allocation order I managed to get the backend to use CL/CH for the struct, but the stores were still separate (even though storing CX would be correct)... Here's another question that falls into the same category: The function X86InstrInfo::loadRegFromStackSlot does not append any implicit uses/defs. How does it
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Hi Krzysztof,

> On May 24, 2016, at 8:03 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario.
>
> For example, I wrote this piece:
>
> typedef struct {
> char x,
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
On several variants of x86 processors, mixing `ah`, `al` and `ax` as source/destination in the same dependency chain will have some penalties, so for THOSE processors, there is a benefit to NOT using `al` and `ah` to reflect parts of `ax` - I believe this is caused by the fact that the processor doesn't ACTUALLY see these as parts of a bigger register internally, and will execute two independent
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Hi,

Could you use "MIR" to forge the example you're looking for?

-- Mehdi

> On May 24, 2016, at 10:10 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same.
>
> As per the subject, I'm not really interested in the
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same. As per the subject, I'm not really interested in the quality of the final code, but in the way that the x86 target deals with the structural relationship between these registers. Specifically, I'd like to see if it would generate implicit defs/uses for AX on defs/uses of
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Here's some of the generated code from the current community head for bzip2.c from spec 256.bzip2, with these options: clang -m32 -S -O2 bzip2.c

.LBB14_4:                               # %bsW.exit24
        subl    %eax, %ebx
        addl    $8, %eax
        movl    %ebx, %ecx
        movl    %eax, bsLive
        shll    %cl, %edi
        movl    %ebp, %ecx
        orl     %esi, %edi
2016 May 24
1
Liveness of AL, AH and AX in x86 backend
Thanks Kevin. This isn't exactly what I'm looking for, though. The ECX is explicitly defined here and CL/CH are only used. I was interested in the opposite situation---where the sub-registers are defined separately and then the super-register is used as a whole. Hopefully the sub-register liveness tracking is what I need, so the questions about x86 may become moot. -Krzysztof
2016 Jan 30
2
Redundant promotion of integer values in x86 target
Hello,

While looking at some internal benchmarks, I found that llvm generates code with redundant promotion, something like:

        xor     %al, %cl
        movzbl  %cl, %ecx
        cmp     $0x20, %ecx

I believe that the promotion stems from the logic in X86TargetLowering::EmitCmp. Comments in the code say, "Do the comparison at i32 if it's smaller, besides the Atom case. This avoids subregister aliasing issues.
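For reference, a hedged sketch (not taken from the original report; the function name is made up) of the kind of i8 comparison behind such code: an xor of two i8 values followed by an unsigned compare against 0x20. Per the comment quoted above, the compare is performed at i32, which is where the extra movzbl comes from.

define i1 @cmp8(i8 %a, i8 %b) {
  ; xor two i8 values, then compare the result against 32 (0x20)
  %x = xor i8 %a, %b
  %c = icmp ult i8 %x, 32
  ret i1 %c
}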
2016 Feb 01
2
Redundant promotion of integer values in x86 target
Sanjay, Kevin,

Thank you for your reply. Kevin, I wonder if you are still working on it and have a plan to submit your changes for review.

Thanks,
Taewook

From: "Smith, Kevin B" <kevin.b.smith at intel.com>
Date: Monday, February 1, 2016 at 3:30 PM
To: 'Sanjay Patel' <spatel at rotateright.com>
2012 Dec 18
2
[LLVMdev] Getting rid of tabs in LLVM's assembly output?
On Tue, Dec 18, 2012 at 11:36 AM, Caldarale, Charles R <Chuck.Caldarale at unisys.com> wrote:
>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
>> On Behalf Of Eli Bendersky
>> Subject: [LLVMdev] Getting rid of tabs in LLVM's assembly output?
>
>> Problem: I then get tabs in my tests, which are discouraged by LLVM's
>>
2015 Jul 27
3
[LLVMdev] i1* function argument on x86-64
I am running into a problem with 'i1*' as a function's argument which seems to have appeared since I switched to LLVM 3.6 (but it could have another source, of course). If I look at the assembler that the MCJIT generates for an x86-64 target I see that the array 'i1*' is taken as a sequence of 1-bit-wide elements. (I guess that's correct.) However, I used to call the function
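A hedged sketch (not the original code; the function name is made up) of the pattern under discussion: a function that indexes into an i1* argument. The question above amounts to whether element %i of such an array lives at byte offset %i or at bit offset %i in the generated code. Typed-pointer IR syntax is used.

define i1 @read_flag(i1* %flags, i64 %i) {
  ; address of the i-th i1 element, then load it
  %p = getelementptr i1, i1* %flags, i64 %i
  %v = load i1, i1* %p
  ret i1 %v
}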
2016 Jan 31
1
Redundant promotion of integer values in x86 target
Hi Taewook - There's a discussion about the underlying x86 micro-arch details here: http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/167221 The conclusion was that we should change how we currently handle these, but we don't want to regress the case that was addressed by: http://reviews.llvm.org/rL195496 There are open bugs with more discussion related to this:
2012 Jan 02
2
[LLVMdev] Transforming wide integer computations back to vector computations
It seems that one of the optimization passes (it seems to be SROA) sometimes transforms computations on vectors of ints to computations on wide integer types; for example, I'm seeing code like the following after optimizations(*):

  %0 = bitcast <16 x i8> %float2uint to i128
  %1 = shl i128 %0, 8
  %ins = or i128 %1, 255
  %2 = bitcast i128 %ins to <16 x i8>

The back end I'm
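As an illustration only (the names and the exact pattern are made up, not taken from the original post), this is the kind of mixed vector/scalar access to an alloca that can lead SROA to pick a wide integer type and express the update as i128 shift/or operations, similar to the sequence quoted above. Typed-pointer IR syntax is used.

define <16 x i8> @set_lane0(<16 x i8> %v) {
  %slot = alloca <16 x i8>
  store <16 x i8> %v, <16 x i8>* %slot
  ; overwrite the first lane through a scalar i8 store (0xFF)
  %lane0 = bitcast <16 x i8>* %slot to i8*
  store i8 -1, i8* %lane0
  %r = load <16 x i8>, <16 x i8>* %slot
  ret <16 x i8> %r
}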
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v2:
  - Adapt patch to work post KPTI and compiler changes
  - Redo all performance testing with latest configs and compilers
  - Simplify mov macro on PIE (MOVABS now)
  - Reduce GOT footprint
- patch v1:
  - Simplify ftrace implementation.
  - Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
  - Use --emit-relocs instead of -pie to reduce
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below the top 2G of the virtual address space. It makes it possible to optionally extend the KASLR randomization range from 1G to 3G. Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler changes, PIE support and KASLR in general. Thanks to