thr3ads.net - similar to: "[LLVMdev] Simple Loop Vectorize Question"

[LLVMdev] Simple Loop Vectorize Question

2013 May 09

Hi Josh, Your module does not have a triple, so the target machine and TargetTransformInfo have no way of knowing if you are running on a machine with vector registers. Try adding '-mcpu=XXXX' to opt and see what happens. Thanks, Nadav On May 9, 2013, at 1:42 PM, Josh Klontz <josh.klontz at gmail.com> wrote: > Hi! I am trying to get the loop vectorizer to work on a

[LLVMdev] Simple Loop Vectorize Question

2013 May 10

Nadav, Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S -debug double.ll' doesn't appear to make a difference. In fact, the option seems to be ignored, since garbage values for -mcpu don't raise an error. Am I overlooking something else as well? Many Thanks, Josh On Thu, May 9, 2013 at 6:06 PM, Nadav Rotem <nrotem at apple.com> wrote: > Hi Josh, > > Your

[LLVMdev] Simple Loop Vectorize Question

2013 May 10

Hi Josh, This line works for me: opt file.ll -loop-vectorize -S -o - -mtriple=x86_64 -mcpu=corei7-avx -debug You need to specify the triple on the command line if it is not inside the module. Thanks, Nadav On May 9, 2013, at 5:53 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

Something like:
index 6db7f68..68564cb 100644
--- a/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr
   Type *DataTy = VectorType::get(ScalarDataTy, VF);
   Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand();
   unsigned Alignment = LI ?

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

Nadav, I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. v/r, Josh [1]

LoopStrengthReduction generates false code

2020 Jun 09

Hi. In my backend I get incorrect code after running LoopStrengthReduction. In the generated code the loop index variable is multiplied by 8 (correct, everything is 64-bit aligned) to get an address offset, but the index variable is also incremented by 1*8, which is not correct. It should be incremented by 1 only, so the factor 8 is applied a second time. I compared the debug output
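
The symptom described in this post can be illustrated with a small standalone sketch. It is not the poster's backend code and not LLVM output; the function names and the element size constant are hypothetical, chosen only to show why scaling by 8 both in the step and at the use visits the wrong addresses.

```cpp
#include <cstddef>
#include <vector>

// Element size in bytes; 8 matches the 64-bit elements mentioned in the post.
constexpr std::size_t kElemBytes = 8;

// Reference form: the index steps by 1 and is scaled at each use.
std::vector<std::size_t> offsetsScaledIndex(std::size_t n) {
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < n; ++i)
    out.push_back(i * kElemBytes);
  return out;
}

// Strength-reduced form: the induction variable is the byte offset itself,
// so it steps by 8 and needs no multiply at the use.
std::vector<std::size_t> offsetsStrengthReduced(std::size_t n) {
  std::vector<std::size_t> out;
  std::size_t off = 0;
  for (std::size_t i = 0; i < n; ++i, off += kElemBytes)
    out.push_back(off);
  return out;
}

// Faulty pattern as described: the variable steps by 8 AND is still
// multiplied by 8 at the use, so the scale factor is applied twice.
std::vector<std::size_t> offsetsDoubleScaled(std::size_t n) {
  std::vector<std::size_t> out;
  std::size_t iv = 0;
  for (std::size_t i = 0; i < n; ++i, iv += kElemBytes)
    out.push_back(iv * kElemBytes);
  return out;
}
```

The first two functions produce the same offsets (0, 8, 16, ...), which is what a correct strength reduction should preserve. The third produces 0, 64, 128, ..., which matches the kind of double scaling the post complains about.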

LoopStrengthReduction generates false code

2020 Jun 09

Hm, no. I expect byte addresses - everywhere. The compiler should not know that the arch needs word addresses. During lowering LOAD and STORE get explicit conversion operations for the memory address. Even if my arch were byte addressed the code would be incorrect/illegal. Boris > On 09.06.2020 at 19:36, Eli Friedman <efriedma at quicinc.com> wrote: > > Blindly guessing here,

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

----- Original Message -----
> From: "Arnold Schwaighofer" <aschwaighofer at apple.com>
> To: "Joshua Klontz" <josh.klontz at gmail.com>
> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu>
> Sent: Friday, November 15, 2013 4:05:53 PM
> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE
>
> Something like:
>
> index

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

Yes, I was just about to send out: DL->getABITypeAlignment(ScalarDataTy); The question is whether "… ABI alignment for the target …" means getPrefTypeAlignment or getABITypeAlignment. I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer"

LoopStrengthReduction generates false code

2020 Jun 10

The IR after LSR is:
*** IR Dump After Loop Strength Reduction ***
; Preheader:
entry:
  tail call void @fill_array(i32* getelementptr inbounds ([10 x i32], [10 x i32]* @buffer, i32 0, i32 0)) #2
  br label %while.body
; Loop:
while.body:                                       ; preds = %while.body, %entry
  %lsr.iv = phi i32 [ %lsr.iv.next, %while.body ], [ 0, %entry ]
  %uglygep = getelementptr

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! -Josh On Fri, Nov 15, 2013 at 3:20 PM, Frank Winter <fwinter at jlab.org> wrote: > Hmm.. I don't quite understand. How can a module validator > catch this, when it's the
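
The thread snippet does not say whether such a pass exists. As a rough illustration of what a runtime alignment check amounts to when written by hand, here is a minimal sketch. `isAligned`, `Path`, and `addU8` are hypothetical names, not LLVM APIs, and the 16-byte default is an assumption standing in for whatever vector width the target reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true if `ptr` is a multiple of
// `alignment` bytes. `alignment` must be a nonzero power of two.
inline bool isAligned(const void *ptr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1)) == 0;
}

enum class Path { Aligned, Fallback };

// Adds two byte arrays element-wise and reports which path the guard chose.
// Both paths run the same scalar loop in this sketch; in real code only the
// Aligned path would be allowed to use aligned vector loads and stores.
Path addU8(const std::uint8_t *a, const std::uint8_t *b, std::uint8_t *dst,
           std::size_t n, std::size_t vecBytes = 16) {
  const bool allAligned = isAligned(a, vecBytes) && isAligned(b, vecBytes) &&
                          isAligned(dst, vecBytes);
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = static_cast<std::uint8_t>(a[i] + b[i]);
  return allAligned ? Path::Aligned : Path::Fallback;
}
```

The guard is evaluated once per call rather than per element, which is the usual reason to hoist such a check out of the loop and keep two versions of the loop body.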

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

A fix for this is in r194876. Thanks for reporting this! On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

Hi Arnold, thanks for the detailed setup. Still, I haven't figured out the right thing to do. I would need only the native target since all generated code will execute on the JIT execution machine (right now, the old JIT interface). There is no need for other targets. Maybe it would be good to ask specific questions: How do I get the triple for the native target? How do I set up the

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 19

On 16/11/2013 7:58 AM, Nadav Rotem wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >> Agreed, is there a pass that will insert a

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, > what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() > so I don't have to hard code 32? Thanks! > I think that's a fair question, and it's about safety. If you're getting this on the

[LLVMdev] Why is the loop vectorizer not working on my function?

2013 Oct 27

Hi Frank, On Oct 26, 2013, at 6:29 PM, Frank Winter <fwinter at jlab.org> wrote: > I would need this to work when calling the vectorizer through > the function pass manager. Unfortunately I am having the same > problem there: I am not sure which function pass manager you are referring to here. I assume you create your own (you are not using opt but configure your own pass

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 16

I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs. My program still passed 16-byte-aligned pointers to the function, which the loop vectorizer processes successfully:
LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1
Since the program runs fine, it seems to be allowed for the CPU to issue a vector

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 16

The vectorizer will now emit = load <8 x i32>, align #TargetAlignmentOfScalari32 where before it would emit = load <8 x i32> (which has the semantics of “= load <8 x i32>, align 0”, which means the address is aligned with the target ABI alignment, see http://llvm.org/docs/LangRef.html#load-instruction). When the backend generates code for the former it will emit an unaligned move:

[LLVMdev] Vectorization factor limitation in Loop Vectorizer

2014 Dec 11

Hi Nadav/Devs, I am exploring the Loop Vectorizer to vectorize i8 scalar operations into 8 x i8 vector operations. I was expecting the Loop Vectorizer to analyze the profitability for a vectorization factor (VF) of 8; however, it is not doing so due to the widest type calculation done for the blocks inside the loop. Maybe I am missing something, but I am curious to know why the Loop Vectorizer limits the
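
To make the "widest type" limitation concrete, here is a small arithmetic sketch. It assumes the cap is simply the vector register width divided by the width of the widest scalar type used in the loop; that rule is my reading of the post, not a quote from the LLVM sources, and `maxVF` is a hypothetical helper.

```cpp
#include <cassert>

// Assumed rule (not taken from LLVM source): the largest vectorization
// factor considered is registerBits / widestTypeBits.
inline unsigned maxVF(unsigned registerBits, unsigned widestTypeBits) {
  assert(widestTypeBits != 0 && "type width must be nonzero");
  return registerBits / widestTypeBits;
}
```

Under this assumption, a loop whose widest type is i32 on a 128-bit register gets `maxVF(128, 32)`, which is 4, even if most of its operations are on i8. A loop that only touches i8 would get `maxVF(128, 8)`, which is 16. This would explain why a VF of 8 is never costed when a single wider value appears in the loop.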
