Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] LLVM Loop Vectorizer"
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
I think we should try to abstract the costs of instructions of various targets instead of trying to replicate them exactly. The coarser the costing infrastructure, the more robust the vectorization pass will be. This also eliminates or reduces the need to update the costing infrastructure whenever new hardware lowers the cost of existing instructions.
- Dibyendu
-----Original Message-----
From:
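A minimal sketch of what a coarse, abstracted cost model along these lines might look like. Everything here is an assumption for illustration: the `CostClass` buckets, the `LoopBodyOp` struct and `isVectorizationProfitable` are hypothetical names and do not correspond to any LLVM API or to a design proposed in the thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical coarse cost buckets (assumption): a target only says whether
// an operation is cheap, moderate or expensive, not its exact latency.
enum class CostClass : unsigned { Cheap = 1, Moderate = 4, Expensive = 16 };

// One operation in the loop body, with its cost bucket in scalar form and
// in vector form (the vector cost covers one whole vector iteration).
struct LoopBodyOp {
  CostClass scalarCost;
  CostClass vectorCost;
};

// Returns true when one vector iteration is estimated to be cheaper than
// the vectorWidth scalar iterations it replaces.
bool isVectorizationProfitable(const std::vector<LoopBodyOp> &body,
                               unsigned vectorWidth) {
  assert(vectorWidth > 0 && "vector width must be positive");
  unsigned scalarTotal = 0;
  unsigned vectorTotal = 0;
  for (const LoopBodyOp &op : body) {
    scalarTotal += static_cast<unsigned>(op.scalarCost);
    vectorTotal += static_cast<unsigned>(op.vectorCost);
  }
  return vectorTotal < scalarTotal * vectorWidth;
}
```

With only three buckets, a new hardware generation that shaves a cycle off an instruction would usually leave the bucket, and therefore the decision, unchanged.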
2012 Oct 05
2
[LLVMdev] LLVM Loop Vectorizer
----- Original Message -----
> From: "Dibyendu Das" <Dibyendu.Das at amd.com>
> To: "Nadav Rotem" <nrotem at apple.com>, "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, October 5, 2012 3:59:56 AM
> Subject: Re: [LLVMdev] LLVM Loop Vectorizer
>
> I think we should try to abstract the costs of
2012 Oct 05
1
[LLVMdev] LLVM Loop Vectorizer
Why not just have a hook into the TargetInstrInfo to query for the cost of an instruction? This is already used in many places throughout the optimizers.
> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Das, Dibyendu
> Sent: Friday, October 05, 2012 2:00 AM
> To: Nadav Rotem; llvmdev at cs.uiuc.edu Mailing
2012 Oct 05
2
[LLVMdev] LLVM Loop Vectorizer
----- Original Message -----
> From: "Ramshankar Ramanarayanan" <Ramshankar.Ramanarayanan at amd.com>
> To: "Hal Finkel" <hfinkel at anl.gov>, "Dibyendu Das" <Dibyendu.Das at amd.com>
> Cc: "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, October 5, 2012 11:00:39 AM
> Subject: RE: [LLVMdev]
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
Perhaps we can parameterize the size of the vector while vectorizing in LLVM and fix up the loop iterators in a target-specific pass.
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Hal Finkel
Sent: Friday, October 05, 2012 8:30 PM
To: Das, Dibyendu
Cc: llvmdev at cs.uiuc.edu Mailing List
Subject: Re: [LLVMdev] LLVM Loop
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
If the -simd option is specified, opt could do validity checks, dependency analysis and such, and recognize that a loop can be executed in parallel. Because the -simd option is specified, it would then convert the data types to vector instructions and add the scaling factor to the loop's iterators. Following this there can be an early machine function pass that sets up processor specific value in all of
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "llvmdev at cs.uiuc.edu Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Friday, October 5, 2012 1:14:47 AM
> Subject: [LLVMdev] LLVM Loop Vectorizer
>
> Hi,
>
> We are starting to work on an LLVM loop vectorizer. There's a number of
> different projects that
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
>I think that the first step would be to expose Target Lowering
Interface (TLI) to OPT's IR-level passes.
By "lowering", we assume the bitcode is more abstract than the machine
code. However, in some situations, it is just the opposite. For instance,
some architectures support vectorization of
min/max/saturated-{add,sub}/conditional-assignment/etc. We need
to detect such
2012 Oct 05
0
[LLVMdev] LLVM Loop Vectorizer
Nadav Rotem wrote:
> Hi,
>
> We are starting to work on an LLVM loop vectorizer. There's a number of different projects that already vectorize LLVM IR. For example Hal's BB-Vectorizer, Intel's OpenCL Vectorizer, Polly, ISPC, AnySL, just to name a few. I think that it would be great if we could collaborate on the areas that are shared between the different projects. I think that
2013 Oct 25
2
[LLVMdev] Bug #16941
Nadav,
The problem appears only for vectors longer than the available hardware
register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8
on AVX). Select does a weird thing. The <8 x i1> mask comes as two XMM registers,
select converts them to a single XMM register (i.e. 8 x 16 bit), and
immediately afterwards it converts back to two XMM registers and does a blend.
Conversion back and forth has
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
You are absolutely right, it's an ISPC workload. I've checked SSE4 and it's
also severely affected.
We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps.
For the rest we use general LLVM instructions. And I actually would really
like to stick with this approach. We rely on LLVM's ability to produce efficient code
from general LLVM IR. Relying on
2013 Oct 26
0
[LLVMdev] Bug #16941
Hi Dmitry,
Yes, this is a known problem with legalizing vector masks. The type <8 x i1> is legalized to <8 x i16> on SSE, but your operands are legalized to <4 x i32>. Type-legalization is performed per-node and we don’t have a good way to support instructions that mix the mask and operand type. Why does ISPC generate illegal vector types? Does ISPC rely on the LLVM codegen to
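The mismatch described here can be illustrated with simple register arithmetic. The sketch below is only an illustration under stated assumptions: `registersNeeded` is a hypothetical helper, not an LLVM function, and it assumes 128-bit XMM registers by default. It counts how many registers a vector occupies once its element type is fixed.

```cpp
#include <cassert>

// Hypothetical helper (assumption, not LLVM API): number of hardware
// registers of regBits bits needed to hold numElts elements of eltBits bits.
unsigned registersNeeded(unsigned numElts, unsigned eltBits,
                         unsigned regBits = 128) {
  assert(regBits > 0 && "register width must be positive");
  // Round up, since a partially filled register still occupies a register.
  return (numElts * eltBits + regBits - 1) / regBits;
}
```

For an 8-element select on SSE, `registersNeeded(8, 32)` is 2 for the i32 operands (two <4 x i32> halves), while `registersNeeded(8, 16)` is 1 for a mask promoted to 16-bit elements. The mask and the operands therefore end up split differently, which is why conversions between the two layouts are needed around the blend.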
2013 Oct 21
0
[LLVMdev] Bug #16941
Hi Dmitry,
ISPC does some instruction selection as part of vectorization (on ASTs!) by placing intrinsics for specific operations. The SEXT to i32 pattern was implemented because LLVM did not support vector-selects when this code was written.
Can you submit a small SSE4 test case that demonstrates the problem? Select is the canonical form of this operation, and SEXT is usually more
2013 Oct 26
1
[LLVMdev] Bug #16941
Hi Nadav,
ISPC has been generating long vectors (on the corresponding ISPC targets) this way
since the very beginning of ISPC, as far as I know. There is no such thing
in the official LLVM documents as "illegal vectors", so people do expect that
arbitrarily long vectors are supported and generated reasonably well. Note,
not super-optimal, but reasonably well. Keeping it this way allows
considering
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
You are right, ISPC may issue intrinsics as a result of AST selection.
Though I believe that we should stick to LLVM IR whenever possible.
Intrinsics may act as boundaries for optimizations (on both data and
control flow) and are generally not optimizable. LLVM may improve over time
from a performance standpoint and we would benefit from it (or it may play
against us, like in this
2011 Nov 23
3
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Duncan,
Thanks for the quick review! Here is a short description (design) of where I am going with this patch:
1. Motivation: Vectors-of-pointers is the first step in supporting scatter/gather instructions (available in AVX2, for example). I believe that this feature was requested on the mailing list before. As mentioned by Hal Finkel earlier today, this feature is desired by autovectorizers as
2013 Jun 02
4
[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn
Hi Jack, thanks for splitting out what the effects of LLVM's / GCC's vectorizers
are.
On 01/06/13 21:34, Jack Howarth wrote:
> On Sat, Jun 01, 2013 at 06:45:48AM +0200, Duncan Sands wrote:
>>
>> These results are very disappointing, I was hoping to see a big improvement
>> somewhere instead of no real improvement anywhere (except for gas_dyn) or a
>> regression
2012 Oct 05
6
[LLVMdev] LLVM Loop Vectorizer
On Oct 5, 2012, at 12:08 AM, Nick Lewycky <nicholas at mxc.ca> wrote:
> I absolutely think that we should have something like TargetData (now DataLayout) but for the vector types and operations. However, I'm not familiar with "Target Lowering Interface". Could you explain?
I agree. Once we make the codegen accessible to the IR-level passes we need to start talking about
2013 Feb 21
2
[LLVMdev] Generate scalar SSE instructions instead of packed instructions
On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote:
> You can change the input LLVM-IR.
>
> On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com>
> wrote:
>
> Hi,
>
> I am interested in evaluating the performance of packed vs scalar
> double-precision floating point instructions on
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
Could you please have a look at bug #16941 and let us know what you think
about it? It's a performance regression after one of your commits.
Thanks.
Dmitry.