thr3ads.net - similar to: "[LLVMdev] supporting SAD in loop vectorizer"

[LLVMdev] supporting SAD in loop vectorizer

2014 Nov 04

----- Original Message -----
> From: "Renato Golin" <renato.golin at linaro.org>
> To: "Dibyendu Das" <Dibyendu.Das at amd.com>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 4, 2014 5:23:30 AM
> Subject: Re: [LLVMdev] supporting SAD in loop vectorizer
>
> On 4 November 2014 11:06, Das, Dibyendu <Dibyendu.Das at amd.com> wrote:

[LLVMdev] supporting SAD in loop vectorizer

2014 Nov 11

----- Original Message -----
> From: "Dibyendu Das" <Dibyendu.Das at amd.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Renato Golin" <renato.golin at linaro.org>
> Cc: llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 4, 2014 12:15:12 PM
> Subject: RE: [LLVMdev] supporting SAD in loop vectorizer
>
> Here's the simple SAD

[LLVMdev] supporting SAD in loop vectorizer

2014 Nov 11

----- Original Message -----
> From: "James Molloy" <james at jamesmolloy.co.uk>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "Dibyendu Das" <Dibyendu.Das at amd.com>, llvmdev at cs.uiuc.edu
> Sent: Tuesday, November 11, 2014 8:21:37 AM
> Subject: Re: [LLVMdev] supporting SAD in loop vectorizer
>
> > If you'd like to

sum elements in the vector

2016 May 28

Hi Rail,

The two revisions below, which detect SAD patterns and emit psadbw instructions on X86, might be of interest:
http://reviews.llvm.org/D14840
http://reviews.llvm.org/D14897

Revisions related to the absdiff intrinsics:
http://reviews.llvm.org/D10867
http://reviews.llvm.org/D11678

Hope this helps.

Regards,
Suyog

On Sat, May 28, 2016 at 4:20 AM, Rail Shafigulin via llvm-dev < llvm-dev at
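
For readers unfamiliar with the pattern, SAD (sum of absolute differences) in scalar form looks roughly like the loop below. This is a generic sketch, not code taken from the thread or from the linked revisions; the function name `sad` and the use of `std::vector` are illustrative assumptions. Unsigned byte inputs feeding a wider accumulator are the shape that the x86 psadbw instruction operates on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Sum of absolute differences over two equally sized byte buffers.
// `sad` is a hypothetical name chosen for this sketch.
std::uint32_t sad(const std::vector<std::uint8_t>& a,
                  const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // Widen before subtracting so the difference cannot wrap around.
        int diff = static_cast<int>(a[i]) - static_cast<int>(b[i]);
        sum += static_cast<std::uint32_t>(std::abs(diff));
    }
    return sum;
}
```

Whether a given compiler version turns this exact loop into psadbw depends on the target and optimization flags, so treat it only as an illustration of the source-level pattern.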

[RFC] Introducing a vector reduction add instruction.

2015 Nov 13

Hi,

When a reduction instruction is vectorized in a loop, it will be turned into an instruction with vector operands of the same operation type. This new instruction has a special property that can give us more flexibility during instruction selection later: this operation is valid as long as the reduction of all elements of the result vector is identical to the reduction of all elements of its
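
One reading of the property described above is that only the reduction over all lanes matters, not the value held in any individual lane. The sketch below models a 4-lane accumulator with `std::array` and compares an ordinary lane-wise add against a variant that folds adjacent pairs into fewer lanes. All names (`Vec4`, `add_lanewise`, `add_pairwise_folded`) are hypothetical and not part of the RFC; the folded variant is only meant to resemble the kind of instruction a backend might prefer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>

using Vec4 = std::array<int, 4>;

// Final reduction performed once after the loop.
int horizontal_sum(const Vec4& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// Ordinary vector add: each lane of x is added to the same lane of acc.
Vec4 add_lanewise(const Vec4& acc, const Vec4& x) {
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) r[i] = acc[i] + x[i];
    return r;
}

// Alternative: adjacent pairs of x are summed into lanes 0 and 2,
// while lanes 1 and 3 of acc pass through unchanged.
Vec4 add_pairwise_folded(const Vec4& acc, const Vec4& x) {
    return {acc[0] + x[0] + x[1], acc[1], acc[2] + x[2] + x[3], acc[3]};
}

// The two results differ lane by lane but agree once reduced.
void check_reduction_equivalence(const Vec4& acc, const Vec4& x) {
    assert(horizontal_sum(add_lanewise(acc, x)) ==
           horizontal_sum(add_pairwise_folded(acc, x)));
}
```

Both variants are interchangeable only because the accumulator is consumed solely by a final sum; if any other code read individual lanes, the substitution would not be valid.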

[RFC] Introducing a vector reduction add instruction.

2015 Nov 19

After some attempts to implement reduce-add in LLVM, I found an easier way to detect reduce-add without introducing new IR operations. The basic idea is to annotate the phi node instead of the add (so that it is easier to handle other reduction operations). In the PHINode class, we can add a flag indicating whether the phi node is a reduction one (the flag can be set in the loop vectorizer for vectorized phi nodes).

[RFC] Introducing a vector reduction add instruction.

2015 Nov 25

On Wed, Nov 25, 2015 at 2:32 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Cong,
>
> After reading the original RFC and this update, I'm still not entirely sure I understand the semantics of the flag you're proposing to add. Does it have something to do with the ordering of the reduction operations?

The flag is only useful for vectorized reduction for now. I'll give

[RFC] Introducing a vector reduction add instruction.

2015 Nov 25

----- Original Message -----
> From: "Xinliang David Li" <davidxl at google.com>
> To: "Cong Hou" <congh at google.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Wednesday, November 25, 2015 5:17:58 PM
> Subject: Re: [llvm-dev] [RFC] Introducing a vector reduction add

[LLVMdev] How to enable use of 64bit load/store for 32bit architecture

2015 Apr 02

Hi James, Jim

If you *really* want this to work in selection DAG then there is a solution, but it's not pretty.

First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you can give load/store a custom legalisation where you change the i64 to MVT::Untyped. So something like this for ISD::STORE:

SDValue ValueToBeStored = St.getOperand(…)
auto SeqOps[] = {

Extending SLP Vectorizer to deal with aggregates?

2015 Oct 14

I'm looking for a sanity check on extending SLP Vectorizer to deal with aggregates. I'd like to vectorize Julia tuple operations. The Julia compiler lowers tuples to LLVM arrays, not LLVM vectors. I've tried making Julia lower tuples to LLVM vectors, but that hurt performance when SLP Vectorizer was not applicable, because of extraction/insertion overhead. I.e., the Julia lowering

sum elements in the vector

2016 May 27

Hi Shahid. Do you mind providing a concrete example of X86 code where an intrinsic was added (preferably with filenames and line numbers)? I'm having difficulty tracking down the steps you provided. Any help is appreciated.

On Mon, Apr 4, 2016 at 9:02 PM, Shahid, Asghar-ahmad <Asghar-ahmad.Shahid at amd.com> wrote:
> Hi Rail,
>
> We had done this for generation

[LLVMdev] LLVM Loop Vectorizer

2012 Oct 05

----- Original Message -----
> From: "Ramshankar Ramanarayanan" <Ramshankar.Ramanarayanan at amd.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Dibyendu Das" <Dibyendu.Das at amd.com>
> Cc: "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, October 5, 2012 11:00:39 AM
> Subject: RE: [LLVMdev]

[LLVMdev] LLVM Loop Vectorizer

2012 Oct 05

If the -simd option is specified, opt could do validity checks, dependency analysis and such, recognize that a loop can be executed in parallel, and then, because -simd is specified, convert the data types to vector instructions and add the scaling factor to the loop's iterators. Following this there can be an early machine function pass that sets up processor specific value in all of

[LLVMdev] LLVM Loop Vectorizer

2012 Oct 05

----- Original Message -----
> From: "Dibyendu Das" <Dibyendu.Das at amd.com>
> To: "Nadav Rotem" <nrotem at apple.com>, "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, October 5, 2012 3:59:56 AM
> Subject: Re: [LLVMdev] LLVM Loop Vectorizer
>
> I think we should try to abstract the costs of

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

I haven't verified that what Matt described is what actually happens, but assuming it is, that is a known issue in the x86 cost model. Vectorizing interleaved memory accesses on x86 was, until recently, disabled by default. It's been enabled since r284779, but the cost model is very conservative, and basically assumes we're going to scalarize interleaved ops. I believe Farhana is working

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 15

Thanks Michael and Dibyendu for doing the experimentation and bringing this to our attention. What Matt described here might be the case. I will take a look at it.

Farhana

From: Michael Kuperstein [mailto:mkuper at google.com]
Sent: Wednesday, December 14, 2016 9:56 AM
To: Das, Dibyendu <Dibyendu.Das at amd.com>; Aleen, Farhana A <farhana.a.aleen at intel.com>
Cc: Matthew

[LLVMdev] LLVM Loop Vectorizer

2012 Oct 05

Perhaps we can parameterize the size of the vector while vectorizing in LLVM and fix up the loop iterators in a target-specific pass.

-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
Sent: Friday, October 05, 2012 8:30 PM
To: Das, Dibyendu
Cc: llvmdev at cs.uiuc.edu Mailing List
Subject: Re: [LLVMdev] LLVM Loop
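
As a source-level analogy for the suggestion in this message, the sketch below keeps the vector width as a parameter and derives the loop step and the scalar remainder from it. It is not how any LLVM pass is implemented; the template parameter `VF` and the function `add_arrays` are hypothetical names, and the inner lane loop merely stands in for a single vector operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element-wise add with the vector width VF left as a parameter.
// The main loop steps by VF; a scalar epilogue handles the remainder.
template <std::size_t VF>
void add_arrays(const std::vector<int>& a, const std::vector<int>& b,
                std::vector<int>& out) {
    static_assert(VF > 0, "vector width must be positive");
    assert(a.size() == b.size() && out.size() == a.size());
    const std::size_t n = a.size();
    const std::size_t vec_end = n - (n % VF);  // last index covered by full vectors
    std::size_t i = 0;
    for (; i < vec_end; i += VF) {
        // Stand-in for one VF-wide vector add.
        for (std::size_t lane = 0; lane < VF; ++lane)
            out[i + lane] = a[i + lane] + b[i + lane];
    }
    for (; i < n; ++i)  // scalar epilogue
        out[i] = a[i] + b[i];
}
```

Instantiating `add_arrays<2>` and `add_arrays<4>` on the same inputs gives identical results; only the iterator step and the split between vector body and epilogue change, which is the part a later fix-up would have to adjust.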

[LLVMdev] How to enable use of 64bit load/store for 32bit architecture

2015 Apr 03

> On Apr 2, 2015, at 2:07 PM, Tom Stellard <tom at stellard.net> wrote:
>
> On Thu, Apr 02, 2015 at 01:35:55PM -0700, Pete Cooper wrote:
>> Hi James, Jim
>>
>> If you *really* want this to work in selection DAG then there is a solution, but it's not pretty.
>>
>> First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you

Enabling scalarized conditional stores in the loop vectorizer

2016 Dec 14

Hi Matt,

Yeah, I used a pretty recent llvm (post 3.9) on an x86-64 (both AMD and Intel).

-dibyendu

From: Matthew Simpson [mailto:mssimpso at codeaurora.org]
Sent: Wednesday, December 14, 2016 10:03 PM
To: Das, Dibyendu <Dibyendu.Das at amd.com>
Cc: Michael Kuperstein <mkuper at google.com>; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] Enabling scalarized conditional stores in

sum elements in the vector

2016 Apr 04

My target has an instruction that adds up all elements in the vector and stores the result in a register. I'm trying to implement it in my compiler but I'm not even sure where to start. I did look at other targets, but they don't seem to have anything like it (I could be wrong. My experience with LLVM is limited, so if I missed it, I'd appreciate it if someone could point it out).
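
To make the semantics of such an instruction concrete, the sketch below computes the same result the way targets without a dedicated instruction commonly do: log2(N) steps in which the upper half of the vector is added onto the lower half. This is a plain C++ model, not LLVM backend code; `reduce_add` is a hypothetical name and the power-of-two lane count is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Horizontal add modelled as repeated halving: after each step the
// live part of the vector is half as wide, until lane 0 holds the sum.
int reduce_add(std::vector<int> v) {
    const std::size_t n = v.size();
    assert(n != 0 && (n & (n - 1)) == 0 && "lane count must be a power of two");
    for (std::size_t half = n / 2; half > 0; half /= 2) {
        for (std::size_t i = 0; i < half; ++i)
            v[i] += v[i + half];  // one shuffle-and-add step on real hardware
    }
    return v[0];
}
```

A target with a native horizontal-add instruction would replace the whole loop nest with that single instruction; the model only pins down what value the instruction is expected to produce.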
