thr3ads.net - similar to: "[LLVMdev] Extracting a value from an union"

Displaying 20 results from an estimated 20000 matches similar to: "[LLVMdev] Extracting a value from an union"

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

Hi Arnold, thanks for the detailed setup. Still, I haven't figured out the right thing to do. I would need only the native target since all generated code will execute on the JIT execution machine (right now, the old JIT interface). There is no need for other targets. Maybe it would be good to ask specific questions: How do I get the triple for the native target? How do I set up the

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

I would need this to work when calling the vectorizer through the function pass manager. Unfortunately I am having the same problem there: LV: The Widest type: 32 bits. LV: The Widest register is: 32 bits. It's not picking up the target information, although I tried with and without the target triple in the module. Any idea what could be wrong? Frank On 26/10/13 15:54, Hal Finkel wrote:

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

>>> LV: The Widest type: 32 bits. >>> LV: The Widest register is: 32 bits. Yep, we don’t pick up the right TTI. Try -march=x86-64 (or leave it out) you already have this info in the triple. Then it should work (does for me with your example below). On Oct 26, 2013, at 2:16 PM, Frank Winter <fwinter at jlab.org> wrote: > Hi Hal! > > I am using the

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

Hi Frank, On Oct 26, 2013, at 6:29 PM, Frank Winter <fwinter at jlab.org> wrote: > I would need this to work when calling the vectorizer through > the function pass manager. Unfortunately I am having the same > problem there: I am not sure which function pass manager you are referring to here. I assume you create your own (you are not using opt but configure your own pass

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

----- Original Message ----- > >>> LV: The Widest type: 32 bits. > >>> LV: The Widest register is: 32 bits. > > Yep, we don’t pick up the right TTI. > > Try -march=x86-64 (or leave it out) you already have this info in the > triple. > > Then it should work (does for me with your example below). That may depend on what CPU it picks by default; Frank,

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

Hi Hal! I am using the 'x86_64' target. Below the complete module dump and here the command line: opt -march=x64-64 -loop-vectorize -debug-only=loop-vectorize -S test.ll Frank ; ModuleID = 'test.ll' target datalayout = "e-p:64:64:64-S128-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f16:16:16-f32:32:32-f64:64:64-f128:128:128-v64:64:64-v128:12

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

Hi Frank, Sent from my iPhone > On Oct 26, 2013, at 10:03 AM, Frank Winter <fwinter at jlab.org> wrote: > > My function implements a simple loop: > > void bar( int start, int end, float* A, float* B, float* C) > { > for (int i=start; i<end;++i) > A[i] = B[i] * C[i]; > } > > This looks pretty much like the standard example. However, I built

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

My function implements a simple loop: void bar( int start, int end, float* A, float* B, float* C) { for (int i=start; i<end;++i) A[i] = B[i] * C[i]; } This looks pretty much like the standard example. However, I built the function with the IRBuilder, thus not coming from C and clang. Also I changed slightly the function's signature: define void @bar([8 x i8]* %arg_ptr) {
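For reference, the loop quoted in this message can be written as a self-contained C function. This is only a sketch of the C-level form given in the message; it does not reproduce the IRBuilder-generated variant with the `[8 x i8]* %arg_ptr` signature, whose body is truncated here.

```c
/* Element-wise product over the half-open index range [start, end).
 * Same signature and body as the loop quoted in the message. */
void bar(int start, int end, float *A, float *B, float *C)
{
    for (int i = start; i < end; ++i)
        A[i] = B[i] * C[i];
}
```

Calling `bar(1, 3, A, B, C)` writes only `A[1]` and `A[2]`; elements outside the range are left untouched.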

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

----- Original Message ----- > Hi Arnold, > > adding '-debug-only=loop-vectorize' to the command gives: > > LV: Checking a loop in "bar" > LV: Found a loop: L0 > LV: Found an induction variable. > LV: Found an unidentified write ptr: %7 = load float** %6 > LV: Found an unidentified read ptr: %10 = load float** %9 > LV: Found an unidentified

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 26

Hi Arnold, adding '-debug-only=loop-vectorize' to the command gives: LV: Checking a loop in "bar" LV: Found a loop: L0 LV: Found an induction variable. LV: Found an unidentified write ptr: %7 = load float** %6 LV: Found an unidentified read ptr: %10 = load float** %9 LV: Found an unidentified read ptr: %13 = load float** %12 LV: We need to do 2 pointer comparisons. LV: We

[LLVMdev] Loop vectorizer dosen't find loop bounds

2013 Oct 29

----- Original Message ----- > Thanks for the alternatives! > > I am trying the 'extracting sub-function' approach. However, it seems > I > can't get the 'subfunction' to pass the verifier. This is my > subfunction: > > define void @main_extern([8 x i8]* %arg_ptr) { > entrypoint: > %0 = getelementptr [8 x i8]* %arg_ptr, i32 0 > %1 =

[LLVMdev] Loop vectorizer dosen't find loop bounds

2013 Oct 29

Thanks for the alternatives! I am trying the 'extracting sub-function' approach. However, it seems I can't get the 'subfunction' to pass the verifier. This is my subfunction: define void @main_extern([8 x i8]* %arg_ptr) { entrypoint: %0 = getelementptr [8 x i8]* %arg_ptr, i32 0 %1 = bitcast [8 x i8]* %0 to i64* %2 = load i64* %1 %3 = getelementptr [8 x i8]*

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

Yes, you need the latest ToT version of llvm or you run -loop-vectorize -earlycse -instcombine -simplifycfg The bitcast essentially is a noop to satisfy the type system. This is what your example looks like for me: vector.body: ; preds = %vector.body, %vector.ph %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ] %.lhs = shl i64 %6, 2

[LLVMdev] Loop vectorizer dosen't find loop bounds

2013 Oct 28

----- Original Message ----- > Bingo! That works (when coming from C source) > > Now, I have a serious problem. I am not coming from C but I build the > function with the builder. I am also forced to change the signature > and > load the pointers a,b,c afterwards: > > define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 { > entrypoint: > %0 = bitcast [8 x

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

The instcombine pass cleans up a lot. Any idea why there are still shufflevector, insertelement, *and* bitcast (!!) etc. instructions left? The original loop is so clean, a textbook example I'd say. There is no need to shuffle anything. At least I don't see it. Frank vector.ph: ; preds = %L5 %broadcast.splatinsert1 = insertelement <4 x

[LLVMdev] Loop vectorizer dosen't find loop bounds

2013 Oct 28

Bingo! That works (when coming from C source) Now, I have a serious problem. I am not coming from C but I build the function with the builder. I am also forced to change the signature and load the pointers a,b,c afterwards: define void @bar([8 x i8]* nocapture readonly %arg_ptr) #0 { entrypoint: %0 = bitcast [8 x i8]* %arg_ptr to i32* %1 = load i32* %0, align 4 %2 = getelementptr [8 x

[LLVMdev] MCJIT generates MOVAPS on unaligned address

2014 Aug 07

MCJIT, when lowering to x86-64, generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on a non-aligned memory address: movaps 88(%rdx), %xmm0 where %rdx comes in as a function argument with only natural alignment (float*). This x86 instruction requires the memory address to be 16-byte aligned, which 88 plus an address aligned to only 4 bytes is not guaranteed to be. Here the
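The alignment arithmetic behind this report can be checked with a small C sketch. The helper names `is_aligned` and `movaps_operand_ok` are hypothetical and chosen for illustration only; the offset 88 and the 16-byte requirement come from the message.

```c
#include <stdint.h>

/* Returns 1 if addr is a multiple of alignment, else 0.
 * alignment must be a power of two. */
int is_aligned(uintptr_t addr, uintptr_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

/* Returns 1 if base + offset is 16-byte aligned, which is what
 * MOVAPS requires of its memory operand. */
int movaps_operand_ok(uintptr_t base, uintptr_t offset)
{
    return is_aligned(base + offset, 16);
}
```

With a base that is only 4-byte aligned, `base + 88` lands on a 16-byte boundary for some bases (for example 0x1008) and not for others (for example 0x1000 or 0x1004), so the aligned load is not safe in general.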

[LLVMdev] loop vectorizer: Unexpected extract/insertelement

2013 Nov 06

The loop vectorizer relies on cleanup passes to be run after it: from Transforms/IPO/PassManagerBuilder.cpp: // Add the various vectorization passes and relevant cleanup passes for // them since we are no longer in the middle of the main scalar pipeline. MPM.add(createLoopVectorizePass(DisableUnrollLoops)); MPM.add(createInstructionCombiningPass());

[LLVMdev] x86-64 backend generates aligned ADDPS with unaligned address

2015 Jul 29

When I compile attached IR with LLVM 3.6 llc -march=x86-64 -o f.S f.ll it generates an aligned ADDPS with unaligned address. See attached f.S, here an extract: addq $12, %r9 # $12 is not a multiple of 4, thus for xmm0 this is unaligned xorl %esi, %esi .align 16, 0x90 .LBB0_1: # %loop2

[LLVMdev] loop vectorizer says Bad stride

2013 Oct 28

Frank, It looks like the loop vectorizer is unable to tell that the two stores in your code never overlap. This is probably because of the sign-extend in your code. Can you extend the indices to 64 bits? Thanks, Nadav On Oct 28, 2013, at 1:38 PM, Frank Winter <fwinter at jlab.org> wrote: > Verifying function > running passes ... > LV: Checking a loop in "bar" > LV:
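At the source level, the suggestion to use 64-bit indices might look like the following C sketch. It reuses the single-store loop from the earlier thread as an assumption, since the code with two stores is not shown in this excerpt, and the name `bar64` is hypothetical. Whether this form actually vectorizes depends on the LLVM version and target.

```c
#include <stdint.h>

/* Same element-wise product as before, but with a 64-bit induction
 * variable, so that on a 64-bit target no sign extension of the index
 * is needed before address computation. */
void bar64(int64_t start, int64_t end, float *A, float *B, float *C)
{
    for (int64_t i = start; i < end; ++i)
        A[i] = B[i] * C[i];
}
```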
