Displaying 20 results from an estimated 5000 matches similar to: "[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory"
2011 Sep 06
0
[LLVMdev] Unexpected behavior reading/writing <8 x i1> vector to memory
On Tue, Sep 6, 2011 at 4:37 PM, Matt Pharr <matt.pharr at gmail.com> wrote:
> I'm seeing some behavior that surprised me in writing an <8 x i1> vector to memory and reading it back. (Specifically, the surprise is that I didn't get the original value back!). This happens both with TOT and 2.9. This program illustrates the issue:
>
> define i32 @foo() {
> %c =
2016 May 24
5
Liveness of AL, AH and AX in x86 backend
I'm trying to see how the x86 backend deals with the relationship
between AL, AH and AX, but I can't get it to generate any code that
would expose an interesting scenario.
For example, I wrote this piece:
typedef struct {
    char x, y;
} struct_t;

struct_t z;

struct_t foo(char *p) {
    struct_t s;
    s.x = *p++;
    s.y = *p;
    z = s;
    s.x++;
    return s;
}
But the output at -O2
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Try using x86 mode rather than Intel64 mode. I have definitely gotten it to use both ah and al in 32 bit x86 code generation.
In particular, I have seen that in loops for both the spec2000 and spec2006 versions of bzip. It can happen, but only rarely.
Kevin Smith
>-----Original Message-----
>From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of
>Krzysztof
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Enabling subreg liveness tracking didn't do anything. By altering the
allocation order I managed to get the backend to use CL/CH for the
struct, but the stores were still separate (even though storing CX would
be correct)...
Here's another question that falls into the same category:
The function X86InstrInfo::loadRegFromStackSlot does not append any
implicit uses/defs. How does it
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Hi Krzysztof,
> On May 24, 2016, at 8:03 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> I'm trying to see how the x86 backend deals with the relationship between AL, AH and AX, but I can't get it to generate any code that would expose an interesting scenario.
>
> For example, I wrote this piece:
>
> typedef struct {
> char x,
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
On several variants of x86 processors, mixing `ah`, `al` and `ax` as
source/destination in the same dependency chain incurs penalties,
so for THOSE processors, there is a benefit to NOT using `al` and `ah` to
reflect parts of `ax` - I believe this is caused by the fact that the
processor doesn't ACTUALLY see these as parts of a bigger register
internally, and will execute two independent
2016 May 24
3
Liveness of AL, AH and AX in x86 backend
Hi,
Could you use "MIR" to forge the example you're looking for?
--
Mehdi
> On May 24, 2016, at 10:10 AM, Krzysztof Parzyszek via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Then let me shift focus from performance to size. With either optsize or minsize, the output is still the same.
>
> As per the subject, I'm not really interested in the
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Then let me shift focus from performance to size. With either optsize
or minsize, the output is still the same.
As per the subject, I'm not really interested in the quality of the
final code, but in the way that the x86 target deals with the structural
relationship between these registers. Specifically, I'd like to see if
it would generate implicit defs/uses for AX on defs/uses of
2016 May 24
0
Liveness of AL, AH and AX in x86 backend
Here's some of the generated code from the current community head for bzip2.c from spec 256.bzip2, with these options:
clang -m32 -S -O2 bzip2.c
.LBB14_4:                               # %bsW.exit24
        subl    %eax, %ebx
        addl    $8, %eax
        movl    %ebx, %ecx
        movl    %eax, bsLive
        shll    %cl, %edi
        movl    %ebp, %ecx
        orl     %esi, %edi
2016 May 24
1
Liveness of AL, AH and AX in x86 backend
Thanks Kevin. This isn't exactly what I'm looking for, though. The ECX
is explicitly defined here and CL/CH are only used. I was interested in
the opposite situation---where the sub-registers are defined separately
and then the super-register is used as a whole.
Hopefully the sub-register liveness tracking is what I need, so the
questions about x86 may become moot.
-Krzysztof
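The situation described above (the halves defined separately, then the whole used at once) can be sketched in plain C. This is only a software model under stated assumptions: `reg16_t`, `def_lo`, `def_hi` and `use_full` are hypothetical names invented for illustration, and nothing here claims to show what the x86 backend actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit register with two 8-bit halves,
   standing in for AX with AL (low) and AH (high). */
typedef struct {
    uint8_t lo; /* plays the role of AL */
    uint8_t hi; /* plays the role of AH */
} reg16_t;

/* Define only the low half; the high half is left untouched. */
void def_lo(reg16_t *r, uint8_t v) { r->lo = v; }

/* Define only the high half; the low half is left untouched. */
void def_hi(reg16_t *r, uint8_t v) { r->hi = v; }

/* Use the register as a whole: this reads both halves, so both of the
   earlier partial definitions must still be live at this point. */
uint16_t use_full(const reg16_t *r) {
    return (uint16_t)(((uint16_t)r->hi << 8) | r->lo);
}

/* Self-check: two partial definitions fully determine the whole value. */
void check_subreg_model(void) {
    reg16_t r = {0, 0};
    def_lo(&r, 0x34);
    def_hi(&r, 0x12);
    assert(use_full(&r) == 0x1234);
}
```

In this model a liveness analysis would have to treat `use_full` as a use of both `lo` and `hi`, which is the structural relationship the question is about.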
2016 Jan 30
2
Redundant promotion of integer values in x86 target
Hello,
While looking at some internal benchmarks, I found that LLVM generates code with a redundant promotion, something like:
xor %al, %cl
movzbl %cl, %ecx
cmp $0x20, %ecx
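For illustration, source of roughly the following shape XORs two byte values and compares the result with 0x20. This is an assumption about what kind of code could lead to such a sequence, not the benchmark from the post; `differs_only_in_bit5` is a hypothetical name, and whether a given compiler version emits the zero-extension for it is not guaranteed.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 when the two bytes differ in exactly
   bit 5 (value 0x20), and 0 otherwise. The XOR is an 8-bit operation;
   the comparison against 0x20 is where a widening to 32 bits could be
   introduced by the compiler. */
int differs_only_in_bit5(unsigned char a, unsigned char b) {
    return (unsigned char)(a ^ b) == 0x20;
}

/* Self-check using raw byte values (0x61 and 0x41 differ only in bit 5). */
void check_differs_only_in_bit5(void) {
    assert(differs_only_in_bit5(0x61, 0x41));
    assert(!differs_only_in_bit5(0x61, 0x61));
}
```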
I believe that the promotion stems from the logic in X86TargetLowering::EmitCmp. A comment in the code says,
"Do the comparison at i32 if it's smaller, besides the Atom case. This avoids subregister aliasing issues.
2016 Feb 01
2
Redundant promotion of integer values in x86 target
Sanjay, Kevin, Thank you for your reply.
Kevin, I wonder if you are still working on it and plan to submit your changes for review.
Thanks,
Taewook
From: "Smith, Kevin B" <kevin.b.smith at intel.com>
Date: Monday, February 1, 2016 at 3:30 PM
To: 'Sanjay Patel' <spatel at rotateright.com<mailto:spatel at
2012 Dec 18
2
[LLVMdev] Getting rid of tabs in LLVM's assembly output?
On Tue, Dec 18, 2012 at 11:36 AM, Caldarale, Charles R
<Chuck.Caldarale at unisys.com> wrote:
>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
>> On Behalf Of Eli Bendersky
>> Subject: [LLVMdev] Getting rid of tabs in LLVM's assembly output?
>
>> Problem: I then get tabs in my tests, which are discouraged by LLVM's
>>
2015 Jul 27
3
[LLVMdev] i1* function argument on x86-64
I am running into a problem with 'i1*' as a function's argument which
seems to have appeared since I switched to LLVM 3.6 (but it may have another
source, of course). If I look at the assembly that MCJIT generates
for an x86-64 target I see that the array 'i1*' is taken as a sequence
of 1 bit wide elements. (I guess that's correct). However, I used to
call the function
2016 Jan 31
1
Redundant promotion of integer values in x86 target
Hi Taewook -
There's a discussion about the underlying x86 micro-arch details here:
http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/167221
The conclusion was that we should change how we currently handle these, but
we don't want to regress the case that was addressed by:
http://reviews.llvm.org/rL195496
There are open bugs with more discussion related to this:
2012 Jan 02
2
[LLVMdev] Transforming wide integer computations back to vector computations
It seems that one of the optimization passes (it seems to be SROA) sometimes transforms computations on vectors of ints to computations on wide integer types; for example, I'm seeing code like the following after optimizations(*):
%0 = bitcast <16 x i8> %float2uint to i128
%1 = shl i128 %0, 8
%ins = or i128 %1, 255
%2 = bitcast i128 %ins to <16 x i8>
The back end I'm
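The correspondence between the wide-integer form and a per-lane vector form can be checked with a small C model. This is a sketch under stated assumptions: it is scaled down from `<16 x i8>`/`i128` to 8 lanes and a 64-bit integer for portability, it assumes lane i occupies bits 8i to 8i+7 (the little-endian bitcast layout), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 8 };

/* Model of the bitcast from a byte vector to a wide integer:
   lane i is placed in bits [8i, 8i+7]. */
uint64_t lanes_to_wide(const uint8_t v[LANES]) {
    uint64_t w = 0;
    for (int i = 0; i < LANES; ++i)
        w |= (uint64_t)v[i] << (8 * i);
    return w;
}

/* Model of the bitcast back from the wide integer to a byte vector. */
void wide_to_lanes(uint64_t w, uint8_t v[LANES]) {
    for (int i = 0; i < LANES; ++i)
        v[i] = (uint8_t)(w >> (8 * i));
}

/* Wide-integer form from the snippet: shift left by 8, then OR in 255. */
uint64_t wide_form(uint64_t w) {
    return (w << 8) | 255u;
}

/* Equivalent vector form: every lane moves up by one position, the top
   lane is dropped, and 255 is inserted into lane 0.
   `in` and `out` must not alias. */
void vector_form(const uint8_t in[LANES], uint8_t out[LANES]) {
    out[0] = 255;
    for (int i = 1; i < LANES; ++i)
        out[i] = in[i - 1];
}

/* Self-check: both forms produce the same lanes for a given input. */
void check_forms_agree(const uint8_t in[LANES]) {
    uint8_t a[LANES], b[LANES];
    wide_to_lanes(wide_form(lanes_to_wide(in)), a);
    vector_form(in, b);
    for (int i = 0; i < LANES; ++i)
        assert(a[i] == b[i]);
}
```

Under these assumptions the shift-and-or on the wide integer is a lane shuffle plus an element insert, which is the kind of pattern a transformation back to vector operations would need to recognize.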
2018 Mar 13
32
[PATCH v2 00/27] x86: PIE support and option to extend KASLR randomization
Changes:
- patch v2:
- Adapt patch to work post KPTI and compiler changes
- Redo all performance testing with latest configs and compilers
- Simplify mov macro on PIE (MOVABS now)
- Reduce GOT footprint
- patch v1:
- Simplify ftrace implementation.
- Use gcc mstack-protector-guard-reg=%gs with PIE when possible.
- rfc v3:
- Use --emit-relocs instead of -pie to reduce
2017 Oct 04
28
x86: PIE support and option to extend KASLR randomization
These patches make the changes necessary to build the kernel as Position
Independent Executable (PIE) on x86_64. A PIE kernel can be relocated below
the top 2G of the virtual address space. This allows optionally extending the
KASLR randomization range from 1G to 3G.
Thanks a lot to Ard Biesheuvel & Kees Cook for their feedback on compiler
changes, PIE support and KASLR in general. Thanks to