Displaying 20 results from an estimated 3730 matches for "i64s".
Did you mean:
i64
2008 Feb 19
4
[LLVMdev] 2008-01-25-ByValReadNone.c Failure
Hi all,
I'm seeing this failure on my PPC G4 box running TOT with llvm-gcc
4.2. Is anyone else seeing this? I'm sure it's related to the byval
stuff that's recently gone into LLVM. I'm attaching the output of
this command:
$ llvm-gcc -emit-llvm -O3 -S -o - -emit-llvm /Users/wendling/llvm/
llvm.src/test/CFrontend/2008-01-25-ByValReadNone.c
As you can see in it, there
2013 Nov 10
3
[LLVMdev] loop vectorizer erroneously finds 256 bit vectors
The loop vectorizer is doing an amazing job so far. Most of the time.
I just came across one function which led to unexpected behavior:
On this function the loop vectorizer finds a 256 bit vector as the
wides vector type for the x86-64 architecture. (!)
This is strange, as it was always finding the correct size of 128 bit
as the widest type. I isolated the IR of the function to check if this
is
2013 Nov 10
0
[LLVMdev] loop vectorizer erroneously finds 256 bit vectors
I looked more into this. For the previously sent IR the vector width of
256 bit is found mistakenly (and reproducibly) on this hardware:
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
For the same IR the loop vectorizer finds the correct vector width (128
bit) on:
model name : Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
model name : Intel(R) Core(TM) i7 CPU M 640 @
2013 Nov 10
2
[LLVMdev] loop vectorizer erroneously finds 256 bit vectors
Hi Frank,
I'm not an Intel expert, but it seems that your Xeon E5 supports AVX, which
does have 256-bit vectors. The other two only supports SSE instructions,
which are only 128-bit long.
cheers,
--renato
On 10 November 2013 06:05, Frank Winter <fwinter at jlab.org> wrote:
> I looked more into this. For the previously sent IR the vector width of
> 256 bit is found mistakenly
2016 Aug 02
2
Instruction selection problems due to SelectionDAGBuilder
Hello.
I'm having problems at instruction selection with my back end with the following
basic-block due to a vector add with immediate constant vector (obtained by vectorizing a
simple C program doing vector sum map):
vector.ph: ; preds = %vector.memcheck50
%.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0
2016 Oct 06
2
LoopVectorizer -- generating bad and unhandled shufflevector sequence
Hi,
I have experimented with enabling the LoopVectorizer for SystemZ. I have
come across a loop which, when vectorized, seems to have been poorly
generated. In short, there seems to be a completely unnecessary sequence
of shufflevector instructions, that doesn't get optimized away anywhere.
In other words, there is a shuffling so that leads back to the original
vector:
[0 1 2 3
2013 Nov 10
0
[LLVMdev] loop vectorizer erroneously finds 256 bit vectors
Hi Renato,
you are right! There is 'avx' support:
fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36
clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology
nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est
tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt aes xsave
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The following IR implements the following nested loop:
for (int i = start ; i < end ; ++i )
for (int p = 0 ; p < 4 ; ++p )
a[i*4+p] = b[i*4+p] + c[i*4+p];
define void @main(i64 %arg0, i64 %arg1, i1 %arg2, i64 %arg3, float*
noalias %arg4, float* noalias %arg5, float* noalias %arg6) {
entrypoint:
br i1 %arg2, label %L0, label %L1
L0:
2016 Nov 02
3
rotl: undocumented LLVM instruction?
We've recently moved our project from LLVM 3.6 to LLVM 3.9. I noticed one
of our code generation tests is breaking in 3.9.
The test is:
; RUN: llc < %s -march=xstg | FileCheck %s
define i64 @bclr64(i64 %a, i64 %b) nounwind readnone {
entry:
; CHECK: bclr r1, r0, r1, 64
%sub = sub i64 %b, 1
%shl = shl i64 1, %sub
%xor = xor i64 %shl, -1
%and = and i64 %a, %xor
ret i64
2016 Nov 03
2
rotl: undocumented LLVM instruction?
Is there any way to get it to delay this optimization where it goes from
this:
Initial selection DAG: BB#0 'bclr64:entry'
SelectionDAG has 14 nodes:
t0: ch = EntryToken
t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1
t6: i64 = sub t4, Constant:i64<1>
t7: i64 = shl Constant:i64<1>, t6
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The loop vectorizer relies on cleanup passes to be run after it:
from Transforms/IPO/PassManagerBuilder.cpp:
// Add the various vectorization passes and relevant cleanup passes for
// them since we are no longer in the middle of the main scalar pipeline.
MPM.add(createLoopVectorizePass(DisableUnrollLoops));
MPM.add(createInstructionCombiningPass());
2013 Nov 06
2
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
The instcombine pass cleans up a lot.
Any idea why there are still shufflevector, insertelement, *and* bitcast
(!!) etc. instructions left? The original loop is so clean, a textbook
example I'd say. There is no need to shuffle anything.At least I don't
see it.
Frank
vector.ph: ; preds = %L5
%broadcast.splatinsert1 = insertelement <4 x
2011 Nov 02
1
[LLVMdev] [LLVMDev]: UNREACHABLE executed!
Hi, guys!
I write a virtual machine which uses LLVM as back-end code generator. The
following function code causes strange "UNREACHABLE executed!" error:
define void @p1(%1*) {
%2 = call i8* @llvm.stacksave()
%3 = alloca %0
%4 = getelementptr %0* %3, i64 1
%5 = ptrtoint %0* %3 to i64
%6 = ptrtoint %0* %4 to i64
%7 = sub i64 %6, %5
%8 = bitcast %0* %3 to i8*
call void
2016 Nov 03
3
rotl: undocumented LLVM instruction?
Setting the ISD::ROTL to Expand doesn't work? (via SetOperation)
You could also do a Custom hook if that's what you're looking for.
On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <phil.a.tomson at gmail.com> wrote:
> ... or perhaps to rephrase:
>
> In 3.9 it seems to be doing a smaller combine much sooner, whereas in 3.6
> it deferred that till later in the
2012 Jul 31
3
[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM
Hi Duncan,
A DragonEgg/GCC-related question: do you know where these strange FRAME
tokens originate from (e.g. %struct.FRAME.matmul)? Compiling simple Fortran
code with DragonEgg:
> cat matmul.f90
subroutine matmul(nx, ny, nz)
implicit none
integer :: nx, ny, nz
real, dimension(nx, ny) :: A
real, dimension(ny, nz) :: B
real, dimension(nx, nz) :: C
integer :: i, j, k
real,
2013 Nov 06
0
[LLVMdev] loop vectorizer: Unexpected extract/insertelement
Yes, you need the latest ToT version of llvm or you run
-loop-vectorize -earlycse -instcombine -simplifycfg
The bitcast essentially is a noop to satisfy the type system.
This is how your example looks like for me:
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%.lhs = shl i64 %6, 2
2017 Oct 13
2
[SelectionDAG] Assertion due to MachineMemOperand flags difference.
Hello,
I've hit an assertion in SelectionDAG where we try to merge 2 loads
that have the same operands but their MMO flags differ. One is
dereferenceable and one is not. I'm not sure what the underlying issue
here is:
1) MDSDNode with the same operands should have the same flags set on
their respective MMO. The fact the flags differ when the
opcode,types,operands and address-space are
2012 Jul 31
0
[LLVMdev] [DragonEgg] Mysterious FRAME coming from gimple to LLVM
According to comment in tree-nested.c, these frames should be only
introduced in case of debug or OpenMP lowering:
/* A subroutine of convert_nonlocal_reference_op. Create a local variable
in the nested function with DECL_VALUE_EXPR set to reference the true
variable in the parent function. This is used both for debug info
and in OpenMP lowering. */
However, in this code example we
2016 Nov 03
2
rotl: undocumented LLVM instruction?
One option may be to prevent the formation of ROTL, if possible, and
then generating rol by hand.
Marking it as "expand" would likely stop the DAG combiner from creating
it. Then you could "preprocess" the selection DAG before the instruction
selection and do the pattern matching yourself.
-Krzysztof
On 11/3/2016 4:24 PM, Phil Tomson via llvm-dev wrote:
> I could try
2014 Dec 26
3
[LLVMdev] Correct usage of `llvm.assume` for loop vectorization alignment?
Using LLVM ToT and Hal's helpful slide deck [1], I've been trying to use
`llvm.assume` to communicate pointer alignment guarantees to vector load
and store instructions. For example, in [2] %5 and %9 are guaranteed to be
32-byte aligned. However, if I run this IR through `opt -O3 -datalayout
-S`, the vectorized loads and stores are still 1-byte aligned [3]. What's
going wrong? Do I