similar to: [LLVMdev] supporting SAD in loop vectorizer

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] supporting SAD in loop vectorizer"

2014 Nov 04
3
[LLVMdev] supporting SAD in loop vectorizer
----- Original Message ----- > From: "Renato Golin" <renato.golin at linaro.org> > To: "Dibyendu Das" <Dibyendu.Das at amd.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Tuesday, November 4, 2014 5:23:30 AM > Subject: Re: [LLVMdev] supporting SAD in loop vectorizer > > On 4 November 2014 11:06, Das, Dibyendu <Dibyendu.Das at amd.com> wrote:
2014 Nov 11
3
[LLVMdev] supporting SAD in loop vectorizer
----- Original Message ----- > From: "Dibyendu Das" <Dibyendu.Das at amd.com> > To: "Hal Finkel" <hfinkel at anl.gov>, "Renato Golin" <renato.golin at linaro.org> > Cc: llvmdev at cs.uiuc.edu > Sent: Tuesday, November 4, 2014 12:15:12 PM > Subject: RE: [LLVMdev] supporting SAD in loop vectorizer > > Here's the simple SAD
2014 Nov 11
4
[LLVMdev] supporting SAD in loop vectorizer
----- Original Message ----- > From: "James Molloy" <james at jamesmolloy.co.uk> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Dibyendu Das" <Dibyendu.Das at amd.com>, llvmdev at cs.uiuc.edu > Sent: Tuesday, November 11, 2014 8:21:37 AM > Subject: Re: [LLVMdev] supporting SAD in loop vectorizer > > > If you'd like to
2016 May 28
4
sum elements in the vector
Hi Rail, Below 2 revisions might be of your interest which Detect SAD patterns and emit psadbw instructions on X86.: http://reviews.llvm.org/D14840 http://reviews.llvm.org/D14897 Intrinsics related to absdiff revisons : http://reviews.llvm.org/D10867 http://reviews.llvm.org/D11678 Hope this helps. Regards, Suyog On Sat, May 28, 2016 at 4:20 AM, Rail Shafigulin via llvm-dev < llvm-dev at
2015 Nov 13
2
[RFC] Introducing a vector reduction add instruction.
Hi When a reduction instruction is vectorized in a loop, it will be turned into an instruction with vector operands of the same operation type. This new instruction has a special property that can give us more flexibility during instruction selection later: this operation is valid as long as the reduction of all elements of the result vector is identical to the reduction of all elements of its
2015 Nov 19
5
[RFC] Introducing a vector reduction add instruction.
After some attempt to implement reduce-add in LLVM, I found out a easier way to detect reduce-add without introducing new IR operations. The basic idea is annotating phi node instead of add (so that it is easier to handle other reduction operations). In PHINode class, we can add a flag indicating if the phi node is a reduction one (the flag can be set in loop vectorizer for vectorized phi nodes).
2015 Nov 25
2
[RFC] Introducing a vector reduction add instruction.
On Wed, Nov 25, 2015 at 2:32 PM, Hal Finkel <hfinkel at anl.gov> wrote: > Hi Cong, > > After reading the original RFC and this update, I'm still not entirely sure I understand the semantics of the flag you're proposing to add. Does it having something to do with the ordering of the reduction operations? The flag is only useful for vectorized reduction for now. I'll give
2015 Nov 25
2
[RFC] Introducing a vector reduction add instruction.
----- Original Message ----- > From: "Xinliang David Li" <davidxl at google.com> > To: "Cong Hou" <congh at google.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Wednesday, November 25, 2015 5:17:58 PM > Subject: Re: [llvm-dev] [RFC] Introducing a vector reduction add
2015 Apr 02
2
[LLVMdev] How to enable use of 64bit load/store for 32bit architecture
Hi James, Jim If you *really* want this to work in selection DAG then there is a solution, but its not pretty. First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you can give load/store a custom legalisation where you change the i64 to MVT::Untyped. So something like this for ISD::STORE: SDValue ValueToBeStored = St.getOperand(…) auto SeqOps[] = {
2015 Oct 14
4
Extending SLP Vectorizer to deal with aggregates?
I'm looking for a sanity check on extending SLP Vectorizer to deal with aggregates. I'd like to vectorize Julia tuple operations. The Julia compiler lowers tuples to LLVM arrays, not LLVM vectors. I've tried making Julia lower tuples to LLVM vectors, but that hurt performance when SLP Vectorizer was not applicable, because of extraction/insertion overhead. I.e., the Julia lowering
2016 May 27
0
sum elements in the vector
Hi Shahid. Do you mind providing a concrete example of X86 code where an intrinsic was added (preferrable with filenames and line numbers)? I'm having difficulty tracking down the steps you provided. Any help is appreciated. On Mon, Apr 4, 2016 at 9:02 PM, Shahid, Asghar-ahmad < Asghar-ahmad.Shahid at amd.com> wrote: > Hi Rail, > > > > We had done this for generation
2012 Oct 05
2
[LLVMdev] LLVM Loop Vectorizer
----- Original Message ----- > From: "Ramshankar Ramanarayanan" <Ramshankar.Ramanarayanan at amd.com> > To: "Hal Finkel" <hfinkel at anl.gov>, "Dibyendu Das" <Dibyendu.Das at amd.com> > Cc: "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Friday, October 5, 2012 11:00:39 AM > Subject: RE: [LLVMdev]
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
If -simd option is specified opt could do validity checks, dependency analysis and such and recognize that a loop can be executed in parallel and as the -simd option is specified, convert the data types to vector instructions and add the scaling factor to the loop's iterators. Following this there can be an early machine function pass that sets up processor specific value in all of
2012 Oct 05
2
[LLVMdev] LLVM Loop Vectorizer
----- Original Message ----- > From: "Dibyendu Das" <Dibyendu.Das at amd.com> > To: "Nadav Rotem" <nrotem at apple.com>, "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Friday, October 5, 2012 3:59:56 AM > Subject: Re: [LLVMdev] LLVM Loop Vectorizer > > I think we should try to abstract the costs of
2016 Dec 14
4
Enabling scalarized conditional stores in the loop vectorizer
I haven't verified what Matt described is what actually happens, but assuming it is - that is a known issue in the x86 cost model. Vectorizing interleaved memory accesses on x86 was, until recently, disabled by default. It's been enabled since r284779, but the cost model is very conservative, and basically assumes we're going to scalarize interleaved ops. I believe Farhana is working
2016 Dec 15
0
Enabling scalarized conditional stores in the loop vectorizer
Thanks Michael and Dibyendu for doing the experimentation and bringing this up to our attention. It might be the case what Matt described here. I will take a look at it. Farhana From: Michael Kuperstein [mailto:mkuper at google.com] Sent: Wednesday, December 14, 2016 9:56 AM To: Das, Dibyendu <Dibyendu.Das at amd.com>; Aleen, Farhana A <farhana.a.aleen at intel.com> Cc: Matthew
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
Perhaps we can parameterize the size of the vector while vectorizing @ llvm and fix up the loop iterators in a target specific pass. -----Original Message----- From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel Sent: Friday, October 05, 2012 8:30 PM To: Das, Dibyendu Cc: llvmdev at cs.uiuc.edu Mailing List Subject: Re: [LLVMdev] LLVM Loop
2015 Apr 03
2
[LLVMdev] How to enable use of 64bit load/store for 32bit architecture
> On Apr 2, 2015, at 2:07 PM, Tom Stellard <tom at stellard.net> wrote: > > On Thu, Apr 02, 2015 at 01:35:55PM -0700, Pete Cooper wrote: >> Hi James, Jim >> >> If you *really* want this to work in selection DAG then there is a solution, but its not pretty. >> >> First make i64 not be legal. Then, assuming the regclass you gave has some subregs, you
2016 Dec 14
0
Enabling scalarized conditional stores in the loop vectorizer
Hi Matt- Yeah I used a pretty recent llvm (post 3.9) on an x86-64 ( both AMD and Intel ). -dibyendu From: Matthew Simpson [mailto:mssimpso at codeaurora.org] Sent: Wednesday, December 14, 2016 10:03 PM To: Das, Dibyendu <Dibyendu.Das at amd.com> Cc: Michael Kuperstein <mkuper at google.com>; llvm-dev at lists.llvm.org Subject: Re: [llvm-dev] Enabling scalarized conditional stores in
2016 Apr 04
7
sum elements in the vector
My target has an instruction that adds up all elements in the vector and stores the result in a register. I'm trying to implement it in my compiler but I'm not sure even where to start. I did look at other targets, but they don't seem to have anything like it ( I could be wrong. My experience with LLVM is limited, so if I missed it, I'd appreciate if someone could point it out ).