Displaying 20 results from an estimated 6000 matches similar to: "[LLVMdev] Bug #16941"
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
You are absolutely right, it's an ISPC workload. I've checked SSE4 and it's
also severely affected.
We use intrinsics only for conversion <N x i32> <=> i32, i.e. movmsk.ps.
For the rest we use general LLVM instructions. And I would really like to
keep it this way. We rely on LLVM's ability to produce efficient code
from general LLVM IR. Relying on
2013 Oct 21
0
[LLVMdev] Bug #16941
Hi Dmitry,
ISPC does some instruction selection as part of vectorization (on ASTs!) by placing intrinsics for specific operations. The SEXT to i32 pattern was implemented because LLVM did not support vector-selects when this code was written.
Can you submit a small SSE4 test case that demonstrates the problem? Select is the canonical form of this operation, and SEXT is usually more
2013 Oct 21
0
[LLVMdev] Bug #16941
Hi Dmitry.
This looks like an ISPC workload. ISPC works around a limitation in SelectionDAG, which does not know how to legalize mask types when both 128-bit and 256-bit registers are available. It does so by expanding the mask to i32s and using intrinsics. Can you please verify that this regression only happens on AVX? Can you change ISPC to use intrinsics?
Thanks
Nadav
Sent
2013 Oct 21
2
[LLVMdev] Bug #16941
Nadav,
You are right, ISPC may issue intrinsics as a result of AST selection.
Though I believe that we should stick to LLVM IR whenever possible.
Intrinsics may act as boundaries for optimizations (on both data and
control flow) and are generally not optimizable. LLVM may improve over time
from a performance standpoint and we would benefit from it (or it may play
against us, like in this
2013 Oct 25
2
[LLVMdev] Bug #16941
Nadav,
The problem appears only for vectors longer than the available hardware
register (in doubleword elements, i.e. more than 4 on SSE4 and more than 8
on AVX). Select does a weird thing. The <8 x i1> mask comes as two XMM
registers, select converts them to a single XMM register (i.e. 8 x 16 bit),
then immediately converts back to two XMM registers and does the blend.
Conversion back and forth has
2013 Oct 26
0
[LLVMdev] Bug #16941
Hi Dmitry,
Yes, this is a known problem with legalizing vector masks. The type <8 x i1> is legalized to <8 x i16> on SSE, but your operands are legalized to <4 x i32>. Type-legalization is performed per-node and we don’t have a good way to support instructions that mix the mask and operand types. Why does ISPC generate illegal vector types? Does ISPC rely on the LLVM codegen to
2013 Oct 26
1
[LLVMdev] Bug #16941
Hi Nadav,
ISPC has been generating long vectors (on the corresponding ISPC targets)
this way since the very beginning of ISPC, as far as I know. There is no such
thing in the official LLVM documentation as "illegal vectors", so people do
expect that arbitrarily long vectors are supported and generated reasonably
well. Note, not super-optimal, but reasonably well. Keeping it this way allows
considering
2013 Oct 22
0
[LLVMdev] Bug #16941
On Oct 21, 2013, at 12:09 PM, Dmitry Babokin <babokin at gmail.com> wrote:
> By the way, I'm curious, is there any reason why you focus on SSE4, not AVX? It seems that the vectorizer should care the most about the latest silicon.
>
I am interested in looking at the SSE4 code because lowering of AVX code is more complicated, especially for masks. The problem that <8 x i1> can be
2013 Oct 21
0
[LLVMdev] LLVMdev Digest, Vol 112, Issue 56
Has anyone worked with or used the LLVM backend or compiler for Haskell?
David
2013 Jul 10
2
[LLVMdev] unaligned AVX store gets split into two instructions
I've narrowed this down to a single kernel (kernel.ll), which does a
fixed-size matrix-matrix multiply:
# ~/llvm-32-final/bin/llc kernel.ll -o kernel32.s
# ~/llvm-33-final/bin/llc kernel.ll -o kernel33.s
# ~/llvm-32-final/bin/clang++ harness.cpp kernel32.s -o harness32
# ~/llvm-32-final/bin/clang++ harness.cpp kernel33.s -o harness33
# time ./harness32
real 0m0.584s
user 0m0.581s
sys 0m0.001s
2013 Sep 19
0
[LLVMdev] unaligned AVX store gets split into two instructions
Nadav,
We see multiple regressions after r172868 in the ISPC compiler (based on the
LLVM optimizer). The regressions are due to spills/reloads, which are due to
increased register pressure. This matches Zach's analysis. We've filed bug
17285 for this problem.
Is there any possibility to avoid splitting in case of multiple loads going
together?
Dmitry.
On Wed, Jul 10, 2013 at 1:12 PM, Zach
2013 Oct 11
2
[LLVMdev] "target-features" and "target-cpu" attributes
Looking forward to these changes! Thanks for working on it.
On Fri, Oct 11, 2013 at 10:32 PM, Bill Wendling <isanbard at gmail.com> wrote:
> Hi Dmitry,
>
> I can try my best, but it would be a bit tricky to get it all finished by
> then...
>
> -bw
>
> On Oct 11, 2013, at 4:10 AM, Dmitry Babokin <babokin at gmail.com> wrote:
>
> Bill,
>
> Are there
2013 Oct 11
2
[LLVMdev] "target-features" and "target-cpu" attributes
Bill,
Are there any chances that you'll complete it before 3.4 is branched?
On Thu, Oct 10, 2013 at 10:16 PM, Bill Wendling <isanbard at gmail.com> wrote:
> On Oct 10, 2013, at 4:22 AM, Dmitry Babokin <babokin at gmail.com> wrote:
>
> > Bill,
> >
> > Thanks for answering. To make sure that we are on the same page, let's
> agree on definitions :) Here, by
2013 Oct 03
2
[LLVMdev] "target-features" and "target-cpu" attributes
Bill, Ben, everyone,
Some time ago the "target-features" and "target-cpu" attributes were
introduced. As I understand, they are intended to support generation of
"fat binaries" (binaries with functions generated for different CPUs),
particularly to support LTO compilation, when different source files have
different targets (say, one of the files should support SSE2, another
2013 Oct 12
0
[LLVMdev] "target-features" and "target-cpu" attributes
FYI:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-October/066389.html
Please read and let me know your comments.
-bw
On Oct 11, 2013, at 2:47 PM, Dmitry Babokin <babokin at gmail.com> wrote:
> Looking forward to these changes! Thanks for working on it.
>
>
> On Fri, Oct 11, 2013 at 10:32 PM, Bill Wendling <isanbard at gmail.com> wrote:
> Hi Dmitry,
>
> I
2013 Oct 10
2
[LLVMdev] "target-features" and "target-cpu" attributes
Bill,
Thanks for answering. To make sure that we are on the same page, let's
agree on definitions :) Here, by fat binaries I mean a binary where some
functions are compiled for one flavor of x86, while others are compiled for
another flavor of x86. I care about the usage model, which is important for
LTO: a dispatch function (compiled for the least common denominator)
plus a set of
2013 Oct 11
0
[LLVMdev] "target-features" and "target-cpu" attributes
Hi Dmitry,
I can try my best, but it would be a bit tricky to get it all finished by then...
-bw
On Oct 11, 2013, at 4:10 AM, Dmitry Babokin <babokin at gmail.com> wrote:
> Bill,
>
> Are there any chances that you'll complete it before 3.4 is branched?
>
>
> On Thu, Oct 10, 2013 at 10:16 PM, Bill Wendling <isanbard at gmail.com> wrote:
> On Oct 10, 2013, at
2013 Oct 09
0
[LLVMdev] "target-features" and "target-cpu" attributes
On Oct 3, 2013, at 9:34 AM, Dmitry Babokin <babokin at gmail.com> wrote:
> Bill, Ben, everyone,
>
> Some time ago the "target-features" and "target-cpu" attributes were introduced. As I understand, they are intended to support generation of "fat binaries" (binaries with functions generated for different CPUs), particularly to support LTO compilation, when
2013 Oct 10
0
[LLVMdev] "target-features" and "target-cpu" attributes
On Oct 10, 2013, at 4:22 AM, Dmitry Babokin <babokin at gmail.com> wrote:
> Bill,
>
> Thanks for answering. To make sure that we are on the same page, let's agree on definitions :) Here, by fat binaries I mean a binary where some functions are compiled for one flavor of x86, while others are compiled for another flavor of x86. I care about the usage model, which is
2012 Oct 05
12
[LLVMdev] LLVM Loop Vectorizer
Hi,
We are starting to work on an LLVM loop vectorizer. There are a number of different projects that already vectorize LLVM IR. For example, Hal's BB-Vectorizer, Intel's OpenCL Vectorizer, Polly, ISPC, AnySL, just to name a few. I think that it would be great if we could collaborate on the areas that are shared between the different projects. I think that refactoring LLVM in a way that