similar to: [LLVMdev] Case where VSETCC DAGCombiner hack doesn't work

Displaying 20 results from an estimated 500 matches similar to: "[LLVMdev] Case where VSETCC DAGCombiner hack doesn't work"

2012 Jul 27
0
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
We no longer have vsetcc, so the comment is wrong. The code looks incorrect. The fact that a vector is power-of-two does not guarantee anything about its legality. For example <128 x i64> would pass the condition in the code below, and die on most targets. From: Villmow, Micah [mailto:Micah.Villmow at amd.com] Sent: Friday, July 27, 2012 22:33 To: Rotem, Nadav; Developers Mailing List
2013 Mar 05
4
[LLVMdev] Vector splitting vs widening
Hello, Working on my (currently out-of-tree) BG/Q PPC enhancements, I've run into the following problem with vector type legalization. Here's a quick example: Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0] Scalarize node result 0: 0x2348220: v1f32 = extract_subvector 0x23434a0, 0x23466e0 [ID=0] Split node result: 0x23469e0: v4f32 =
2012 Jul 27
2
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
if (N0.getOpcode() == ISD::SETCC && (LegalOperations || (!LegalOperations && VT.isPow2VectorType()))) But the comment right after it is: // sext(setcc) -> sext_in_reg(vsetcc) for vectors. // Only do this before legalize for now. if (VT.isVector() && !LegalOperations) { So, these optimizations are never safe in the general case if we can't
2013 Mar 06
0
[LLVMdev] Vector splitting vs widening
Hi Hal, > The problem is essentially the following: there are no vector f32 types (yet), so the <v4i1> = setcc <v4f32> node needs to be split and scalarized. The operand splitting seems to start correctly, but because <v4i1> is itself a legal type, after splitting the node into <v2i1> = setcc <v2f32>, the process becomes confused. The operands are again split
2013 Mar 09
1
[LLVMdev] Vector splitting vs widening
----- Original Message ----- > From: "Nadav Rotem" <nrotem at apple.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "llvmdev at cs.uiuc.edu Dev" <llvmdev at cs.uiuc.edu> > Sent: Wednesday, March 6, 2013 3:40:50 PM > Subject: Re: [LLVMdev] Vector splitting vs widening > > Hi Hal, > > > > > > > The
2013 Mar 05
0
[LLVMdev] Vector splitting vs widening
Hi Hal, On 05/03/13 18:50, Hal Finkel wrote: > Hello, > > Working on my (currently out-of-tree) BG/Q PPC enhancements, I've run into the following problem with vector type legalization. Here's a quick example: > > Scalarize node result 0: 0x2348420: v1f32 = extract_subvector 0x23434a0, 0x2348320 [ID=0] > > Scalarize node result 0: 0x2348220: v1f32 = extract_subvector
2013 Feb 14
1
[LLVMdev] LiveIntervals analysis problem
Hello everyone, please I need your help. To reproduce my problem I created simple pass for backends (TestPass.cpp in attached files). That pass I call from Mips backend in this way (MipsTargetMachine.cpp): bool MipsPassConfig::addPreRegAlloc() { addPass(createTestPass()); return false; } The problem becomes, when I am trying compile file ldtoa.ll (in attached files). Compiling
2009 Feb 19
3
[LLVMdev] Possible DAGCombiner or TargetData Bug
I got bit by this in LLVM 2.4 DagCombiner.cpp and it's still in trunk: SDValue DAGCombiner::visitSTORE(SDNode *N) { [...] // If this is a store of a bit convert, store the input value if the // resultant store does not need a higher alignment than the original. if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() && ST->isUnindexed()) {
2009 Feb 19
0
[LLVMdev] Possible DAGCombiner or TargetData Bug
I agree, that doesn't look right. It looks like this is what was intended: Index: lib/CodeGen/SelectionDAG/DAGCombiner.cpp =================================================================== --- lib/CodeGen/SelectionDAG/DAGCombiner.cpp (revision 65000) +++ lib/CodeGen/SelectionDAG/DAGCombiner.cpp (working copy) @@ -4903,9 +4903,9 @@ // resultant store does not need a higher alignment than
2016 May 31
2
[PATCH 1/2] Modify autoconf tests for intrinsics to stop clang from optimizing them away.
--- configure.ac | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/configure.ac b/configure.ac index a67aa37..c722556 100644 --- a/configure.ac +++ b/configure.ac @@ -472,6 +472,7 @@ AS_IF([test x"$enable_intrinsics" = x"yes"],[ [[ static float32x4_t A0, A1, SUMM; SUMM = vmlaq_f32(SUMM, A0,
2009 May 20
2
[LLVMdev] [PATCH] Add new phase to legalization to handle vector operations
Per subject, this patch adding an additional pass to handle vector operations; the idea is that this allows removing the code from LegalizeDAG that handles illegal types, which should be a significant simplification. There are still some issues with this patch, but does the approach look sane? -Eli -------------- next part -------------- Index: lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp
2009 Oct 26
1
[PATCH] Fix miscompile of SSE resampler
From: Thorvald Natvig <slicer at users.sourceforge.net> Some optimizing compilers miscompile the current SSE optimizations when full optimizations are enabled. By using output value pointer instead of a return value, we can bypass this misbehaviour. --- libspeex/resample.c | 8 ++++---- libspeex/resample_sse.h | 24 ++++++++---------------- 2 files changed, 12 insertions(+), 20
2015 Mar 13
1
[RFC PATCH v3] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2015 Mar 12
1
[RFC PATCHv2] Intrinsics/RTCD related fixes. Mostly x86.
From: Jonathan Lennox <jonathan at vidyo.com> * Makes ?enable-intrinsics work with clang and other non-GCC compilers * Enables RTCD for the floating-point-mode SSE code in Celt. * Disables use of RTCD in cases where the compiler targets an instruction set by default. * Enables the SSE4.1 Silk optimizations that apply to the common parts of Silk when Opus is built in floating-point mode, not
2014 Oct 13
2
[LLVMdev] Unexpected spilling of vector register during lane extraction on some x86_64 targets
Hello, Depending on how I extract integer lanes from an x86_64 xmm register, the backend may spill that register in order to load scalars. The effect was observed on two targets: corei7-avx and btver1 (I haven't checked other targets). Here's a test case with spilling/no-spilling code put on conditional compile: #if __SSE4_1__ != 0 #include <smmintrin.h> #else #include
2012 Jul 27
0
[LLVMdev] TLI.getSetCCResultType() and/or MVT broken by design?
Hi Micah, I think that getSetCCResultType should only be called for legal types. Disabling it on isPow2VectorType is not the way to go because there are other illegal vector types which are pow-of-two. I suggest that you call it only after type-legalization. BTW, you can't set the LLVMTy yourself because you don't have access to the LLVMContext at that point. Nadav From:
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:47 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > > Yes this IR does not build or shuffle any vector. Try to write a function that takes 8 ints and a pointer to a <4xi32>, builds two vectors with the 8 ints, > > This might sound like a dumb question, but how does one build a vector of ints out of regular ints in IR? See:
2016 Mar 18
2
generate vectorized code
> On Mar 18, 2016, at 1:37 PM, Rail Shafigulin <rail at esenciatech.com> wrote: > >> I think you created a cycle, this is easy to do with SelectionDAG :) >> Basically SelecitonDAG will iterate until it does not see anything to change. So if you insert a transformation on a pattern A, that generates pattern B, while you have another transformation that matches B and
2017 Sep 15
2
What should a truncating store do?
OK, I'm clear on scalars. Data races are thankfully OK in this context. Densely packing vectors sounds efficient and is clear in the case where lanes * width is a multiple of 8 bits. I don't think I understand how it works in other cases. If we could take store <4 x i8> truncating to <4 x i7> as an example. This can be converted into four scalar i8 -> i7 stores with
2018 Mar 20
1
Polly -polly-prevect-width
i musing polly with vec-width=16 default my IR emits <16xi32> and remaining as <4xi32> by using polly. I want my IR to emit <16xi32> and remaining left as <8xi32>. How to do this? i m trying to use -polly-prevect-width. please help. -------------- next part -------------- An HTML attachment was scrubbed... URL: