Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work"
2012 Jul 27
0
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
We no longer have vsetcc, so the comment is wrong. The code looks incorrect. The fact that a vector is power-of-two does not guarantee anything about its legality. For example <128 x i64> would pass the condition in the code below, and die on most targets.
From: Villmow, Micah [mailto:Micah.Villmow at amd.com]
Sent: Friday, July 27, 2012 22:33
To: Rotem, Nadav; Developers Mailing List
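The point made in the reply above, that a power-of-two element count says nothing about legality, can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyVecType`, `isPow2VectorType` (as a free function) and `isLegalOnToyTarget` are hypothetical names, not LLVM's EVT/MVT API, and the "target" is an invented one whose only legal vectors are exactly 128 bits wide.

```cpp
#include <cassert>

// Toy stand-in for a vector value type <NumElts x iEltBits>.
// NOT LLVM's EVT/MVT; a hypothetical model for illustration only.
struct ToyVecType {
  unsigned NumElts;
  unsigned EltBits;
};

// Mirrors the idea of a power-of-two element-count check.
constexpr bool isPow2VectorType(ToyVecType VT) {
  return VT.NumElts != 0 && (VT.NumElts & (VT.NumElts - 1)) == 0;
}

// Assumed target: the only legal vectors are exactly 128 bits wide.
constexpr bool isLegalOnToyTarget(ToyVecType VT) {
  return VT.NumElts * VT.EltBits == 128;
}

// <4 x i32> is power-of-two and legal on the toy target.
static_assert(isPow2VectorType(ToyVecType{4, 32}) &&
              isLegalOnToyTarget(ToyVecType{4, 32}), "");
// <128 x i64> passes the power-of-two check but is not legal here.
static_assert(isPow2VectorType(ToyVecType{128, 64}) &&
              !isLegalOnToyTarget(ToyVecType{128, 64}), "");
```

A guard built only on the power-of-two check would therefore accept <128 x i64> on this toy target, which is the failure mode described in the reply.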
2013 Mar 05
4
[LLVMdev] Vector splitting vs widening
Hello,
Working on my (currently out-of-tree) BG/Q PPC enhancements, I've run into the following problem with vector type legalization. Here's a quick example:
Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0]
Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0]
Split node result: 0x23469e0: v4f32 =
2012 Jul 27
2
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
if (N0.getOpcode() == ISD::SETCC
    && (LegalOperations
        || (!LegalOperations && VT.isPow2VectorType())))
But the comment right after it is:
// sext(setcc) -> sext_in_reg(vsetcc) for vectors.
// Only do this before legalize for now.
if (VT.isVector() && !LegalOperations) {
So, these optimizations are never safe in the general case if we can't
2013 Mar 06
0
[LLVMdev] Vector splitting vs widening
Hi Hal,
> The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are again split
2013 Mar 09
1
[LLVMdev] Vector splitting vs widening
----- Original Message -----
> From: "Nadav Rotem" <nrotem at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu>
> Sent: Wednesday, March 6, 2013 3:40:50 PM
> Subject: Re: [LLVMdev] Vector splitting vs widening
>
> Hi Hal,
>
> The
2013 Mar 05
0
[LLVMdev] Vector splitting vs widening
Hi Hal,
On 05/03/13 18:50, Hal Finkel wrote:
> Hello,
>
> Working on my (currently out-of-tree) BG/Q PPC enhancements, I've run into the following problem with vector type legalization. Here's a quick example:
>
> Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0]
>
> Scalarize node result 0: 0x2348220: v1f32 = extract_subvector
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
Hello everyone,
please I need your help.
To reproduce my problem I created a simple pass for backends (TestPass.cpp
in the attached files). I call that pass from the Mips backend like this
(MipsTargetMachine.cpp):
bool MipsPassConfig::addPreRegAlloc() {
  addPass(createTestPass());
  return false;
}
The problem arises when I try to compile the file ldtoa.ll (in the attached
files). Compiling
2009 Feb 19
3
[LLVMdev] Possible DAGCombiner or TargetData Bug
I got bitten by this in LLVM 2.4's DAGCombiner.cpp, and it's still in trunk:
SDValue DAGCombiner::visitSTORE(SDNode *N) {
  [...]
  // If this is a store of a bit convert, store the input value if the
  // resultant store does not need a higher alignment than the original.
  if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
      ST->isUnindexed()) {
2009 Feb 19
0
[LLVMdev] Possible DAGCombiner or TargetData Bug
I agree, that doesn't look right. It looks like this
is what was intended:
Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp
===================================================================
--- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000)
+++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy)
@@ -4903,9 +4903,9 @@
// resultant store does not need a higher alignment than
2016 May 31
2
[PATCH 1/2] Modify autoconf tests for intrinsics to stop clang from optimizing them away.
---
configure.ac | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)
diff --git a/configure.ac b/configure.ac
index a67aa37..c722556 100644
--- a/configure.ac
+++ b/configure.ac
@@ -472,6 +472,7 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[
[[
static float32x4_t A0, A1, SUMM;
SUMM = vmlaq_f32(SUMM, A0,
2009 May 20
2
[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations
Per subject, this patch adds an additional pass to handle vector
operations; the idea is that this allows removing the code from
LegalizeDAG that handles illegal types, which should be a significant
simplification. There are still some issues with this patch, but does
the approach look sane?
-Eli
-------------- next part --------------
Index: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net>
Some optimizing compilers miscompile the current SSE optimizations when
full optimizations are enabled. By using output value pointer instead of
a return value, we can bypass this misbehaviour.
---
libspeex/resample.c | 8 ++++----
libspeex/resample_sse.h | 24 ++++++++----------------
2 files changed, 12 insertions(+), 20
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com>
* Makes --enable-intrinsics work with clang and other non-GCC compilers
* Enables RTCD for the floating-point-mode SSE code in Celt.
* Disables use of RTCD in cases where the compiler targets an instruction set by default.
* Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
Hello,
Depending on how I extract integer lanes from an x86_64 xmm register, the
backend may spill that register in order to load scalars. The effect was
observed on two targets: corei7-avx and btver1 (I haven't checked other
targets).
Here's a test case with spilling/no-spilling code put on conditional
compile:
#if __SSE4_1__ != 0
#include <smmintrin.h>
#else
#include
2012 Jul 27
0
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
Hi Micah,
I think that getSetCCResultType should only be called for legal types. Disabling it on isPow2VectorType is not the way to go because there are other illegal vector types which are pow-of-two. I suggest that you call it only after type-legalization.
BTW, you can't set the LLVMTy yourself because you don't have access to the LLVMContext at that point.
Nadav
From:
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote:
>
> Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints,
>
> This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR?
See:
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:37 PM, Rail Shafigulin <rail at esenciatech.com> wrote:
>
>> I think you created a cycle, this is easy to do with SelectionDAG :)
>> Basically SelectionDAG will iterate until it does not see anything to change. So if you insert a transformation on a pattern A, that generates pattern B, while you have another transformation that matches B and
2017 Sep 15
2
What should a truncating store do?
OK, I'm clear on scalars. Data races are thankfully OK in this context.
Densely packing vectors sounds efficient and is clear in the case where
lanes * width is a multiple of 8 bits. I don't think I understand how it
works in other cases.
If we could take store <4 x i8> truncating to <4 x i7> as an example. This
can be converted into four scalar i8 -> i7 stores with
2018 Mar 20
1
Polly -polly-prevect-width
I'm using Polly with vec-width=16. By default my IR emits <16xi32> and the
remainder as <4xi32> when using Polly. I want my IR to emit <16xi32> and the
remainder as <8xi32>. How do I do this?
I'm trying to use -polly-prevect-width.
Please help.